什么是PostgreSQL? PostgreSQL是一个功能强大的开源对象关系型数据库系统,拥有超过35年的活跃开发历史。它以其可靠性、功能完整性和性能著称,被业界誉为”最先进的开源数据库”。
核心特性 1. 标准兼容性
完全符合SQL标准
支持ACID事务
支持复杂查询和子查询
2. 扩展性
支持自定义数据类型
支持自定义函数和操作符
支持存储过程和触发器
3. 高级功能
JSON和JSONB支持
全文搜索
地理空间数据支持(PostGIS)
分区表
并行查询
4. 可靠性
多版本并发控制(MVCC)
时间点恢复(PITR)
在线备份
流复制
数据类型 1. 基本数据类型 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 SMALLINT INTEGER BIGINT DECIMAL REAL DOUBLE PRECISION CHAR (n) VARCHAR (n) TEXT DATE TIME TIMESTAMP INTERVAL
2. 高级数据类型 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 JSON JSONB INTEGER [] TEXT[] POINT LINE POLYGON INET CIDR MACADDR
基本操作 1. 数据库操作 1 2 3 4 5 6 7 8 CREATE DATABASE mydb WITH OWNER postgres;DROP DATABASE mydb;\c mydb
2. 表操作 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 CREATE TABLE users ( id SERIAL PRIMARY KEY, username VARCHAR (50 ) UNIQUE NOT NULL , email VARCHAR (100 ) UNIQUE NOT NULL , age INTEGER CHECK (age >= 0 ), created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP , updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); ALTER TABLE users ADD COLUMN phone VARCHAR (20 );ALTER TABLE users ALTER COLUMN email TYPE VARCHAR (150 );CREATE INDEX idx_users_username ON users(username);CREATE UNIQUE INDEX idx_users_email ON users(email);DROP TABLE users;
3. 数据操作 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 INSERT INTO users (username, email, age) VALUES ('john_doe' , 'john@example.com' , 30 );INSERT INTO users (username, email, age) VALUES ('alice' , 'alice@example.com' , 25 ), ('bob' , 'bob@example.com' , 35 ); SELECT * FROM users WHERE age > 25 ;SELECT username, email FROM users ORDER BY created_at DESC LIMIT 10 ;UPDATE users SET age = 31 WHERE username = 'john_doe' ;DELETE FROM users WHERE age < 18 ;
高级查询 1. 连接查询 1 2 3 4 5 6 7 8 9 10 11 12 13 14 SELECT u.username, p.title, p.contentFROM users uINNER JOIN posts p ON u.id = p.user_id;SELECT u.username, COUNT (p.id) as post_countFROM users uLEFT JOIN posts p ON u.id = p.user_idGROUP BY u.username;SELECT * FROM table1FULL OUTER JOIN table2 ON table1.id = table2.id;
2. 子查询 1 2 3 4 5 6 7 8 9 10 11 12 SELECT username, (SELECT COUNT (* ) FROM posts WHERE user_id = users.id) as post_count FROM users;SELECT * FROM usersWHERE (age, username) IN (SELECT age, username FROM users_backup);SELECT * FROM usersWHERE EXISTS (SELECT 1 FROM posts WHERE user_id = users.id);
3. 窗口函数 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 SELECT username, age, RANK () OVER (ORDER BY age DESC ) as age_rank, DENSE_RANK () OVER (ORDER BY age DESC ) as dense_rank FROM users;SELECT username, age, AVG (age) OVER () as avg_age, SUM (age) OVER (ORDER BY id) as cumulative_age FROM users;SELECT username, age, department, ROW_NUMBER () OVER (PARTITION BY department ORDER BY age) as dept_rank FROM users;
JSON操作 1. JSON数据类型 1 2 3 4 5 6 7 8 9 10 CREATE TABLE products ( id SERIAL PRIMARY KEY, name VARCHAR (100 ), attributes JSONB ); INSERT INTO products (name, attributes) VALUES ('Laptop' , '{"brand": "Dell", "specs": {"cpu": "i7", "ram": "16GB"}}' );
2. JSON查询 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 SELECT * FROM products WHERE attributes- >> 'brand' = 'Dell' ;SELECT * FROM products WHERE attributes- > 'specs' - >> 'cpu' = 'i7' ;SELECT name, attributes- > 'specs' - >> 'ram' as ramFROM productsWHERE attributes @> '{"brand": "Dell"}' ;SELECT * FROM productsWHERE attributes- > 'tags' ? 'gaming' ;
3. JSON索引 1 2 3 4 5 CREATE INDEX idx_products_attributes ON products USING GIN (attributes);CREATE INDEX idx_products_brand ON products ((attributes- >> 'brand' ));
全文搜索 1. 创建全文索引 1 2 3 4 5 6 7 8 9 10 11 12 13 14 CREATE TABLE articles ( id SERIAL PRIMARY KEY, title VARCHAR (200 ), content TEXT, search_vector tsvector ); CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);UPDATE articles SET search_vector = to_tsvector('english' , title || ' ' || content);
2. 全文搜索查询 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 SELECT * FROM articlesWHERE search_vector @@ to_tsquery('english' , 'database & postgresql' );SELECT title, ts_rank(search_vector, to_tsquery('english' , 'postgresql' )) as rank FROM articlesWHERE search_vector @@ to_tsquery('english' , 'postgresql' )ORDER BY rank DESC ;SELECT title, ts_headline('english' , content, to_tsquery('postgresql' )) as highlighted FROM articlesWHERE search_vector @@ to_tsquery('postgresql' );
存储过程 1. 创建存储过程 1 2 3 4 5 6 7 8 9 10 11 12 13 14 CREATE OR REPLACE FUNCTION get_user_posts(user_id INTEGER )RETURNS TABLE (post_title VARCHAR , post_date TIMESTAMP ) AS $$BEGIN RETURN QUERY SELECT p.title, p.created_at FROM posts p WHERE p.user_id = get_user_posts.user_id ORDER BY p.created_at DESC ; END ;$$ LANGUAGE plpgsql; SELECT * FROM get_user_posts(1 );
2. 带参数的存储过程 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 CREATE OR REPLACE FUNCTION create_user( p_username VARCHAR , p_email VARCHAR , p_age INTEGER ) RETURNS INTEGER AS $$ DECLARE new_id INTEGER ; BEGIN INSERT INTO users (username, email, age) VALUES (p_username, p_email, p_age) RETURNING id INTO new_id; RETURN new_id; END ;$$ LANGUAGE plpgsql; SELECT create_user('newuser' , 'new@example.com' , 25 );
触发器 1. 创建触发器 1 2 3 4 5 6 7 8 9 10 11 12 13 14 CREATE OR REPLACE FUNCTION update_updated_at()RETURNS TRIGGER AS $$BEGIN NEW.updated_at = CURRENT_TIMESTAMP ; RETURN NEW ; END ;$$ LANGUAGE plpgsql; CREATE TRIGGER update_users_updated_at BEFORE UPDATE ON users FOR EACH ROW EXECUTE FUNCTION update_updated_at();
2. 审计触发器 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 CREATE TABLE user_audit ( id SERIAL PRIMARY KEY, user_id INTEGER , action VARCHAR (10 ), old_data JSONB, new_data JSONB, changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); CREATE OR REPLACE FUNCTION audit_user_changes()RETURNS TRIGGER AS $$BEGIN IF TG_OP = 'DELETE' THEN INSERT INTO user_audit (user_id, action, old_data) VALUES (OLD.id, 'DELETE' , row_to_json(OLD )::jsonb); RETURN OLD ; ELSIF TG_OP = 'UPDATE' THEN INSERT INTO user_audit (user_id, action, old_data, new_data) VALUES (OLD.id, 'UPDATE' , row_to_json(OLD )::jsonb, row_to_json(NEW )::jsonb); RETURN NEW ; ELSIF TG_OP = 'INSERT' THEN INSERT INTO user_audit (user_id, action, new_data) VALUES (NEW.id, 'INSERT' , row_to_json(NEW )::jsonb); RETURN NEW ; END IF; RETURN NULL ; END ;$$ LANGUAGE plpgsql; CREATE TRIGGER user_audit_trigger AFTER INSERT OR UPDATE OR DELETE ON users FOR EACH ROW EXECUTE FUNCTION audit_user_changes();
视图 1. 创建视图 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 CREATE VIEW active_users AS SELECT id, username, email, ageFROM usersWHERE status = 'active' ;CREATE VIEW user_statistics AS SELECT u.id, u.username, u.email, COUNT (p.id) as post_count, MAX (p.created_at) as last_post_date FROM users uLEFT JOIN posts p ON u.id = p.user_idGROUP BY u.id, u.username, u.email;
2. 物化视图 1 2 3 4 5 6 7 8 9 10 11 12 13 14 CREATE MATERIALIZED VIEW monthly_stats AS SELECT DATE_TRUNC('month' , created_at) as month , COUNT (* ) as user_count, AVG (age) as avg_age FROM usersGROUP BY DATE_TRUNC('month' , created_at);REFRESH MATERIALIZED VIEW monthly_stats; CREATE INDEX idx_monthly_stats_month ON monthly_stats (month );
分区表 1. 范围分区 1 2 3 4 5 6 7 8 9 10 11 12 13 14 CREATE TABLE sales ( id SERIAL, sale_date DATE , amount DECIMAL (10 ,2 ), customer_id INTEGER ) PARTITION BY RANGE (sale_date); CREATE TABLE sales_2024_q1 PARTITION OF sales FOR VALUES FROM ('2024-01-01' ) TO ('2024-04-01' ); CREATE TABLE sales_2024_q2 PARTITION OF sales FOR VALUES FROM ('2024-04-01' ) TO ('2024-07-01' );
2. 列表分区 1 2 3 4 5 6 7 8 9 10 11 12 13 CREATE TABLE customers ( id SERIAL, name VARCHAR (100 ), region VARCHAR (50 ) ) PARTITION BY LIST (region); CREATE TABLE customers_north PARTITION OF customers FOR VALUES IN ('North' , 'Northeast' , 'Northwest' ); CREATE TABLE customers_south PARTITION OF customers FOR VALUES IN ('South' , 'Southeast' , 'Southwest' );
并发控制 1. 事务 1 2 3 4 5 6 7 8 9 10 11 12 BEGIN ;UPDATE accounts SET balance = balance - 100 WHERE id = 1 ;UPDATE accounts SET balance = balance + 100 WHERE id = 2 ;COMMIT ;ROLLBACK ;
2. 锁 1 2 3 4 5 6 7 8 9 10 SELECT * FROM users WHERE id = 1 FOR UPDATE ;SELECT * FROM users WHERE id = 1 FOR SHARE;SELECT pg_advisory_lock(12345 );SELECT pg_advisory_unlock(12345 );
性能优化 1. 索引优化 1 2 3 4 5 6 7 8 9 10 11 CREATE INDEX idx_users_name_age ON users(username, age);CREATE INDEX idx_active_users ON users(username) WHERE status = 'active' ;CREATE INDEX idx_lower_username ON users (lower (username));CREATE INDEX idx_users_covering ON users(username, email, age);
2. 查询优化 1 2 3 4 5 6 7 8 9 10 11 EXPLAIN SELECT * FROM users WHERE username = 'john' ; EXPLAIN ANALYZE SELECT * FROM users WHERE age > 25 ; WITH active_users AS ( SELECT * FROM users WHERE status = 'active' ) SELECT * FROM active_users WHERE age > 25 ;
3. 配置优化 1 2 3 4 5 6 7 8 9 10 shared_buffers = 256 MB effective_cache_size = 1 GB work_mem = 4 MB maintenance_work_mem = 64 MB max_connections = 100 max_worker_processes = 8 max_parallel_workers = 8
备份与恢复 1. 逻辑备份 1 2 3 4 5 6 7 8 pg_dump mydb > backup.sql pg_dump -t users -t posts mydb > backup.sql pg_dump mydb | gzip > backup.sql.gz
2. 恢复 1 2 3 4 5 psql mydb < backup.sql gunzip -c backup.sql.gz | psql mydb
3. 物理备份 1 2 pg_basebackup -D /backup/data -Ft -z -P
复制 1. 主从复制 1 2 3 4 5 6 7 wal_level = replica max_wal_senders = 3 wal_keep_segments = 64 primary_conninfo = 'host=master_ip port=5432 user=replicator password=secret'
2. 逻辑复制 1 2 3 4 5 6 7 CREATE PUBLICATION my_publication FOR TABLE users, posts;CREATE SUBSCRIPTION my_subscription CONNECTION 'host=master_ip port=5432 dbname=mydb user=replicator password=secret' PUBLICATION my_publication;
与Python集成 使用psycopg2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 import psycopg2import psycopg2.extrasconn = psycopg2.connect( host="localhost" , database="mydb" , user="postgres" , password="password" ) cur = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) cur.execute("SELECT * FROM users WHERE age > %s" , (25 ,)) users = cur.fetchall() cur.execute( "INSERT INTO users (username, email, age) VALUES (%s, %s, %s) RETURNING id" , ("newuser" , "new@example.com" , 30 ) ) user_id = cur.fetchone()['id' ] conn.commit() cur.close() conn.close()
使用SQLAlchemy 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 from sqlalchemy import create_engine, Column, Integer, String, DateTimefrom sqlalchemy.ext.declarative import declarative_basefrom sqlalchemy.orm import sessionmakerengine = create_engine('postgresql://postgres:password@localhost/mydb' ) Base = declarative_base() class User (Base ): __tablename__ = 'users' id = Column(Integer, primary_key=True ) username = Column(String(50 ), unique=True ) email = Column(String(100 ), unique=True ) age = Column(Integer) created_at = Column(DateTime) Session = sessionmaker(bind=engine) session = Session() users = session.query(User).filter (User.age > 25 ).all () new_user = User(username='test' , email='test@example.com' , age=30 ) session.add(new_user) session.commit()
常见应用场景
企业级应用 :ERP、CRM系统
金融系统 :银行、保险、证券
电商平台 :商品管理、订单系统
内容管理 :CMS、博客系统
数据分析 :数据仓库、报表系统
优缺点分析 优点
功能完整,符合标准
扩展性强
性能优秀
社区活跃
开源免费
缺点
配置复杂
内存占用较高
学习曲线陡峭
某些操作性能不如MySQL
学习资源推荐
总结 PostgreSQL作为企业级关系型数据库,凭借其强大的功能、优秀的性能和良好的扩展性,成为企业级应用的首选。无论是复杂的事务处理、数据分析还是地理空间应用,PostgreSQL都能提供完整的解决方案。对于需要高可靠性和功能完整性的企业应用,PostgreSQL是最佳选择之一。