# postgres
The superuser account is `postgres`.
When a server instance is first created, the only database on the server is also named `postgres`.
## Resources
- [Official Documentation](https://www.postgresql.org/docs/current/index.html)
## Schemas
```sql
CREATE SCHEMA rpt AUTHORIZATION postgres;
COMMENT ON SCHEMA rpt IS 'Reporting';
GRANT ALL PRIVILEGES ON SCHEMA rpt TO postgres;
ALTER USER postgres SET search_path TO rpt, public;
```
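With `rpt` first in the `search_path`, unqualified object names resolve to the `rpt` schema. A quick sketch (the `demo` table is hypothetical):
```sql
SHOW search_path;             -- rpt, public
CREATE TABLE demo (id int);   -- created as rpt.demo, not public.demo
SELECT * FROM rpt.demo;       -- the same table, schema-qualified
```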
## Setup
### Create database
Create server instance > right click on databases > new database
### Create Tables
Expand your database > Expand Schemas > Tables > Create > table creation GUI
#### Constraints and more
In the column menu of the table GUI you can select the edit icon next to any column,
which opens a sub-menu with items that let you do things like add constraints
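The same can be done in plain SQL with `ALTER TABLE`; a sketch against the `rpt.report_data` table defined later in these notes (the constraint name is made up):
```sql
-- Add a check constraint to an existing column
ALTER TABLE rpt.report_data
    ADD CONSTRAINT "CK_Amount_Positive" CHECK (amount >= 0);

-- Tighten a column to disallow NULLs
ALTER TABLE rpt.report_data
    ALTER COLUMN amount SET NOT NULL;
```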
## Stored Procedures
```sql
CREATE OR REPLACE FUNCTION rpt."FN_Nuke_From_Orbit"() RETURNS void LANGUAGE plpgsql AS $$
BEGIN
TRUNCATE TABLE rpt.report_data;
TRUNCATE TABLE rpt.report_data_clean;
TRUNCATE TABLE rpt.location_trended;
TRUNCATE TABLE rpt.location_top;
END;
$$;
ALTER FUNCTION rpt."FN_Nuke_From_Orbit"() OWNER TO postgres;
COMMENT ON FUNCTION rpt."FN_Nuke_From_Orbit"()
IS 'Wipe all data in the rpt schema from the face of the Earth';
CREATE OR REPLACE PROCEDURE rpt."USP_Refresh"() LANGUAGE plpgsql AS $$
BEGIN
/*********************************************************
USAGE:
This stored procedure can be run at any interval desired.
Since this performs a full truncate and load of the entire
rpt schema and all table data this will be expensive
in resources.
Recommendation is, depending on geographic factors of
store locations affecting end of day accounting totals,
to run the stored procedure in the off hours such as
at midnight or directly after C.O.B. so the day's
results are posted immediately after the conclusion
of that day's business.
*********************************************************/
/*********************************************************
Wipe The entire structure of the rpt schema for a clean
full rebuild via truncate and load ETL
*********************************************************/
PERFORM rpt."FN_Nuke_From_Orbit"();
/*********************************************************
Grab all raw data and kick off the live rebuild process
driven by triggers.
*********************************************************/
INSERT INTO rpt.report_data
SELECT p.payment_date
, a.address
, p.amount
FROM public.payment AS p
LEFT JOIN public.staff AS s ON s.staff_id = p.staff_id
LEFT JOIN public.store AS st ON st.store_id = s.store_id
LEFT JOIN public.address AS a ON a.address_id = st.address_id;
COMMIT;
END;
$$;
COMMENT ON PROCEDURE rpt."USP_Refresh"()
IS 'Refresh the entire reporting structure';
```
## Tables
```sql
CREATE TABLE rpt.report_data
(
"payment_date" timestamp without time zone NOT NULL,
"address" character varying (255) NOT NULL,
"amount" numeric
);
ALTER TABLE rpt.report_data OWNER to postgres;
```
## Triggers
```sql
CREATE OR REPLACE FUNCTION rpt.FN_Clean_Data() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
INSERT INTO rpt.report_data_clean
SELECT DATE_PART('year', payment_date) AS Year
, address AS Location
, CAST(amount AS money) AS Revenue
FROM rpt.report_data;
RETURN NULL;
END;
$$;
ALTER FUNCTION rpt.FN_Clean_Data() OWNER TO postgres;
COMMENT ON FUNCTION rpt.FN_Clean_Data()
IS 'New data in rpt.report_data gets cleaned and inserted into rpt.report_data_clean';
CREATE TRIGGER TR_ETL AFTER INSERT ON rpt.report_data
FOR STATEMENT
EXECUTE FUNCTION rpt.FN_Clean_Data();
COMMENT ON TRIGGER TR_ETL ON rpt.report_data
IS 'Update reports when new data is added to the rpt.report_data table';
/*========================================================================*/
CREATE OR REPLACE FUNCTION rpt.FN_ETL() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
TRUNCATE TABLE rpt.location_trended;
INSERT INTO rpt.location_trended
SELECT "Year", "Location", SUM("Revenue")
FROM rpt.report_data_clean
GROUP BY "Year", "Location";
TRUNCATE TABLE rpt.location_top;
INSERT INTO rpt.location_top
SELECT "Year", "Location", SUM("Revenue")
, RANK() OVER(ORDER BY "Year" DESC, SUM("Revenue") DESC) AS "Rank"
FROM rpt.report_data_clean
WHERE "Year" IN (
-- CAST(DATE_PART('year', NOW()) AS INT) -- Current Year
-- , CAST(DATE_PART('year', NOW()) AS INT) - 1 -- Prior Year
(SELECT MAX("Year") FROM rpt.report_data_clean)
, (SELECT MAX("Year") - 1 FROM rpt.report_data_clean)
)
GROUP BY "Year", "Location";
RETURN NULL;
END;
$$;
ALTER FUNCTION rpt.FN_ETL() OWNER TO postgres;
COMMENT ON FUNCTION rpt.FN_ETL()
IS 'New data in rpt.report_data_clean so update all reports';
CREATE TRIGGER TR_Update_Reports AFTER INSERT ON rpt.report_data_clean
FOR STATEMENT
EXECUTE FUNCTION rpt.FN_ETL();
COMMENT ON TRIGGER TR_Update_Reports ON rpt.report_data_clean
IS 'Update reports when new data is added to the rpt.report_data_clean table';
```
## Data Types
### Numeric
#### Serial
preferred for auto-incrementing values like primary keys
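A minimal sketch (the `widgets` table is hypothetical):
```sql
CREATE TABLE widgets (
    widget_id SERIAL PRIMARY KEY,  -- backed by an auto-created sequence
    name      text NOT NULL
);
-- the id is generated automatically
INSERT INTO widgets (name) VALUES ('sprocket') RETURNING widget_id;
```
On newer Postgres versions, `GENERATED ALWAYS AS IDENTITY` is the standards-compliant alternative to `SERIAL`.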
### Array
#### Array data
```sql
SELECT ARRAY['Lemon', 'Bat Limited Edition' ] AS example_purchased_products;
/*
example_purchased_products
------------------------------
{Lemon,"Bat Limited Edition"}
*/
```
#### Unnest array
```sql
SELECT UNNEST(ARRAY[123, 456, 789]) AS example_ids;
```
![[2022-01-26-12-29-12.png]]
This is a great way to add tags to an item in a single column in a better way than using string split on csv values!
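For example, assuming a hypothetical `items` table with a `tags` array column, rows can be filtered without any string splitting:
```sql
-- rows that carry a given tag
SELECT name FROM items WHERE 'on_sale' = ANY (tags);

-- rows whose tags contain ALL of the listed values (containment operator)
SELECT name FROM items WHERE tags @> ARRAY['on_sale', 'clearance'];
```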
#### Array aggregate
```sql
SELECT product_type, ARRAY_AGG(DISTINCT model) AS models FROM products GROUP BY 1;
```
![[2022-01-26-12-31-31.png]]
#### String splitting to array
```sql
SELECT STRING_TO_ARRAY('hello there how are you?', ' ');
```
![[2022-01-26-12-32-26.png]]
### Casting
Unlike [[TSQL]], casting doesn't require the `CAST(column AS type)` function (though Postgres supports it too). Instead, within Postgres you can cast datatypes like `column::datatype` using the double colon syntax `::`
```sql
SELECT product_id
, model
, year::TEXT
, product_type
, base_msrp
, production_start_date
, production_end_date
FROM products;
```
## Functions
### Coalesce
> To illustrate a simple usage of the COALESCE function, let's return to the customers table. Let's say the marketing team would like a list of the first names, last names, and phone numbers of all male customers. However, for those customers with no phone number, they would like the table to instead write the value 'NO PHONE'. We can accomplish this request with `COALESCE`:
```sql
SELECT first_name
, last_name
, COALESCE(phone, 'NO PHONE') AS phone
FROM customers
ORDER BY 1;
```
> When dealing with creating default values and avoiding `NULL`, `COALESCE` will always be helpful.
### Copy
> The COPY statement retrieves data from your database and dumps it in the file format of your choosing. For example, take the following statement:
#### Data To STDOUT
```sql
COPY (SELECT * FROM customers LIMIT 5) TO STDOUT WITH CSV HEADER;
```
- `COPY` is simply the command used to transfer data to a file format.
- `(SELECT * FROM customers LIMIT 5)` is the query that we want to copy.
- `TO STDOUT` indicates that the results should be printed rather than saved to a file on the hard drive. "Standard Out" is the common term for displaying output in a command-line terminal environment.
- `WITH` is an optional keyword used to separate the parameters that we will use in the database-to-file transfer.
- `CSV` indicates that we will use the CSV file format. We could have also specified `BINARY` or left this out altogether and received the output in text format.
- `HEADER` indicates that we want the header printed as well.
#### Data to a file
```sql
COPY (SELECT * FROM customers LIMIT 5) TO '/path/to/my_file.csv' WITH CSV HEADER;
```
#### Use psql CLI to get data into the database
```bash
psql -h my_host -p 5432 -d my_database -U my_username
\copy (SELECT * FROM customers LIMIT 5) TO 'my_file.csv' WITH CSV HEADER
```
- `\copy` is invoking the Postgres `COPY` ... `TO STDOUT`... command to output the data.
- `(SELECT * FROM customers LIMIT 5)` is the query that we want to copy.
- `TO 'my_file.csv'` indicates that psql should save the output from standard out into my_file.csv.
- The `WITH CSV HEADER` parameters operate the same as before.
##### Configuring the copy command
- `DELIMITER 'delimiter_character'` can be used to specify the delimiter character
for CSV or text files (for example `','` for CSV files, or `'|'` for pipe-separated files)
##### Loading data into a table
```bash
\copy customers FROM 'my_file.csv' WITH (FORMAT csv, HEADER, DELIMITER ',')
```
### Least Greatest
Unlike aggregate functions such as `MIN` or `MAX`, which operate across rows, `LEAST` and `GREATEST` are scalar functions that compare the values passed to them within a single row.
> Two functions that come in handy for data preparation are the `LEAST` and `GREATEST` functions. Each function takes any number of values and returns the least or the greatest of the values, respectively.
>
> A simple use of this variable would be to replace the value if it's too high or low. For example, the sales team may want to create a sales list where every scooter is $600 or less than that. We can create this using the following query:
```sql
SELECT product_id
, model
, year
, product_type
, LEAST(600.00, base_msrp) AS base_msrp
, production_start_date
, production_end_date
FROM products
WHERE product_type='scooter'
ORDER BY 1;
```
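`GREATEST` works the same way in the other direction, e.g. to enforce a price floor (the $400 floor here is made up):
```sql
SELECT product_id
, model
, GREATEST(400.00, base_msrp) AS base_msrp
FROM products
WHERE product_type='scooter'
ORDER BY 1;
```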
### Now
#### Details
`NOW()` is like `GETDATE()` in [[TSQL]]: it returns the current timestamp, but as a
`timestamp with time zone` value, so the relevant time zone info comes along with it.
#### Usage
```sql
SELECT NOW();
```
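Because the result carries a time zone, it can be shifted or truncated inline:
```sql
SELECT NOW() AT TIME ZONE 'UTC';  -- the same instant as a timestamp in UTC
SELECT NOW()::date;               -- just the date portion
```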
### Window Functions
Examples of window functions
```sql
SELECT {columns}
, {window_func} OVER (PARTITION BY {partition_key} ORDER BY {order_key})
FROM table1;
```
more realistic example:
```sql
SELECT customer_id
, title
, first_name
, last_name
, gender
, COUNT(*) OVER (PARTITION BY gender ORDER BY customer_id) AS total_customers
, SUM(CASE WHEN title IS NOT NULL THEN 1 ELSE 0 END) OVER (PARTITION BY gender ORDER BY customer_id) AS total_customers_title
FROM customers
ORDER BY customer_id;
```
The window Keyword:
```sql
SELECT customer_id
, title
, first_name
, last_name
, gender
, COUNT(*) OVER w AS total_customers
, SUM(CASE WHEN title IS NOT NULL THEN 1 ELSE 0 END) OVER w AS total_customers_title
FROM customers
WINDOW w AS (PARTITION BY gender ORDER BY customer_id) -- Reduces typing and improves legibility
ORDER BY customer_id;
```
## Indexes
```sql
CREATE UNIQUE INDEX "CIX_Year_Loc"
ON rpt.location_trended USING btree
("Year" ASC NULLS LAST, "Location" COLLATE pg_catalog."default" ASC NULLS LAST)
TABLESPACE pg_default;
ALTER TABLE rpt.location_trended CLUSTER ON "CIX_Year_Loc";
```
## Tips and Tricks
### date intervals
```sql
SELECT TIMESTAMP '2016-03-01 00:00:00' + INTERVAL '7 days' AS new_date;
/*
new_date
-------------------
2016-03-08 00:00:00
(1 row)
*/
```
### extract date parts
```sql
SELECT current_date
, EXTRACT(dow FROM current_date) AS day_of_week
, EXTRACT(week FROM current_date) AS week_of_year
, EXTRACT(quarter FROM current_date) AS quarter;
/*
 current_date | day_of_week | week_of_year | quarter
--------------+-------------+--------------+---------
 2019-04-28   |           0 |           17 |       2
(1 row)
*/
```
### query tuning with explain
```sql
EXPLAIN (FORMAT JSON)
SELECT *
FROM public.test
WHERE hello IN (SELECT hello FROM public.test ORDER BY hello DESC LIMIT 1);
```
![[2022-01-26-10-01-27.png]]
Using `EXPLAIN (ANALYZE)` and then pasting the output of the plan into this tool gives a flame-graph-style view:
<https://explain.depesz.com/>
![[2022-01-26-10-09-32.png]]
This tool is also amazing for displaying graphics on a plan
<https://explain.dalibo.com/>
![[2022-01-26-10-30-43.png]]
To have `EXPLAIN` output generated automatically, load the `auto_explain` module (you must be a superuser to load it):
```sql
LOAD 'auto_explain';
```
`EXPLAIN` on its own only provides the plan estimates without running the query.
`EXPLAIN ANALYZE` actually executes the statement, so be careful when testing it against a `DELETE` or other destructive statement.
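Once the module is loaded, `auto_explain` is controlled through settings; a sketch of two common ones:
```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 250; -- log plans for statements over 250 ms
SET auto_explain.log_analyze = true;     -- include actual row counts and timings
```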
### returning data you touched
```sql
INSERT INTO person (name) VALUES ('Groot') RETURNING id;
/*
id
----
4
(1 row)
*/
```
### returning newly inserted record
#### Return Newly Inserted Record
When inserting a new record into a table, if you also want that record immediately
returned to the output viewer after insertion just add the `RETURNING` keyword
##### Before
```sql
INSERT INTO products ( col1, col2, col3 ) VALUES ( val1, val2, val3 );
```
##### After
```sql
-- This returns all fields of the newly inserted record but you can also make
-- the returned value something like just the new PK ID of the record
INSERT INTO products ( col1, col2, col3 ) VALUES ( val1, val2, val3 ) RETURNING *;
```
### writeable cte
![[2022-01-26-10-39-53.png]]
```sql
-- Delete rows while simultaneously
-- inserting them elsewhere
WITH moved_rows AS (
DELETE FROM products
WHERE date >= '2010-10-01' AND date < '2010-11-01'
RETURNING *
)
INSERT INTO products_log SELECT * FROM moved_rows;
```
## pgagent and pgadmin
### pgadmin
#### Setup
pgAdmin stores your passwords for all Postgres server instances.
The master password you provide for this is not the same as the installation password for Postgres.
### pgagent
Reference: <https://severalnines.com/database-blog/overview-job-scheduling-tools-postgresql>
Like [[sql-agent]], it is a job scheduling agent for PostgreSQL that allows the execution of stored procedures, SQL statements, and shell scripts.
The agent runs as a daemon on Linux systems and periodically connects to the database to check whether there are any jobs to execute.
Scheduling is easily managed from pgAdmin 4, but pgAgent is not installed by default with pgAdmin; it must be downloaded and installed separately.
#### Step 1
Installation of pgAdmin 4
```bash
sudo apt install pgadmin4 pgadmin4-apache2
```
#### Step 2
Creation of plpgsql procedural language if not defined
```sql
CREATE TRUSTED PROCEDURAL LANGUAGE 'plpgsql'
    HANDLER plpgsql_call_handler
    VALIDATOR plpgsql_validator;
```
#### Step 3
Installation of pgAgent
```bash
sudo apt-get install pgagent
```
#### Step 4
Creation of the pgagent extension
```sql
CREATE EXTENSION pgagent;
```
To define a new job, right click on "pgAgent Jobs", select "Create", give the job a
name, and define the steps it should execute:
![[2022-01-14-23-43-37.png]]
![[2022-01-14-23-44-04.png]]
![[2022-01-14-23-44-19.png]]
#### Step 5
Finally, to have the agent running in the background it's necessary to launch the
following process manually:
```bash
/usr/bin/pgagent host=localhost dbname=postgres user=postgres port=5432 -l 1
```
## workflow
### Adding Data To a Table manually
When you select values from a table you can edit the output grid and add new rows.
Once you want to commit the data, click the save icon to write the changes to the table:
![[2022-01-10-14-30-12.png]]
### migrating databases
To migrate a database:
1. Right click on the database on the source server
2. Backup
3. Save the file as `*.sql`
4. Create a new database on the target server
5. Right click the new database and select `restore`
6. Choose the `*.sql` file you made to restore from
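The same migration can be scripted with `pg_dump` and `psql` (host and database names here are placeholders):
```bash
# Dump the source database to a plain-SQL file
pg_dump -h source_host -U postgres -d my_database -f my_database.sql

# Create an empty database on the target server, then replay the dump
createdb -h target_host -U postgres my_database
psql -h target_host -U postgres -d my_database -f my_database.sql
```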