Tuesday, December 25, 2007

New Postgres Online Magazine

I ran across a new Postgres ezine, Postgres Online Journal. Dec 2007 is complete and Jan 2008 is being constructed. You can read the current, under construction issue as HTML while it is being constructed. It looks like you can also download a PDF when an issue is complete. It will be nice to have an additional resource for Postgres developers and DBAs. This is just a sign that Postgres is becoming more mainstream. In the past all online Postgres information came from a very small group of people. The Dec 2007 issue has some good info:

  • PostgreSQL The Road Behind and Ahead
  • PostgreSQL 8.3 is just around the Corner
  • Converting from Unix Timestamp to PostgreSQL Timestamp or Date Beginner
  • Using Distinct ON to return newest order for each customer Intermediate
  • How to create an index based on a function Intermediate
  • The Anatomy of a PostgreSQL - Part 1 Beginner
  • How does CLUSTER ON improve index performance Intermediate
  • Language Architecture in PostgreSQL Intermediate
  • PostGIS for geospatial analysis and mapping Intermediate
  • Database Abstraction with Updateable Views Advanced
The article on updateable views is very good. I have used updateable views A LOT in Oracle but hadn't given that any thought in Postgres. DISTINCT ON is a nice feature also. You can click on About The Authors to see who is writing this. It leads to a company site called Paragon Corporation. They appear to be a database consulting company that works with most databases. This is a nice addition to the Postgres online resource pool. LewisC

Sunday, October 7, 2007

Best Way to Handle No Data Found in a Procedure?

When it comes to data issues (too many rows, no data found, etc), in Oracle stored procedures, I am used to having exceptions raised that I then handle. PL/pgSQL does not raise exceptions for the same conditions in the same way. The Postgres docs are pretty complete though and through some reading this weekend, I discovered a new keyword. For example, assuming that I have this table (which is empty) in both Oracle and Postgres:

CREATE TABLE empty_table
(
  empty_col integer
);
In Oracle this procedure:
CREATE OR REPLACE PROCEDURE no_data_found_test
AS
  v_int_field INTEGER;
BEGIN

  SELECT empty_col
    INTO v_int_field
    FROM empty_table;

END;
When run:
BEGIN
  no_data_found_test;
END;
Produces an error:
Error starting at line 1 in command:
BEGIN
  no_data_found_test;
END;
Error report:
ORA-01403: no data found
ORA-06512: at "HR.NO_DATA_FOUND_TEST", line 6
ORA-06512: at line 2
01403. 00000 -  "no data found"
*Cause:    
*Action:
However, the equivalent procedure in Postgres:
CREATE OR REPLACE FUNCTION no_data_found_test()
  RETURNS void AS
$BODY$
DECLARE
  v_int_field INTEGER;
BEGIN

  SELECT empty_col
    INTO v_int_field
    FROM empty_table;

END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
Does not raise an exception when run:
postgres=# select * from no_data_found_test();
 no_data_found_test
--------------------

(1 row)
I have been using the FOUND variable to check for a result and raise an exception manually if no data was found. Like this:
CREATE OR REPLACE FUNCTION no_data_found_test()
  RETURNS void AS
$BODY$
DECLARE
  v_int_field INTEGER;
BEGIN

  SELECT empty_col
    INTO v_int_field
    FROM empty_table;

  IF NOT FOUND THEN
    raise exception 'NO_DATA_FOUND';
  END IF;

END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
When run, I do get an exception like I was expecting.
postgres=# select * from no_data_found_test();
ERROR:  NO_DATA_FOUND
However, after a little bit more reading, I see that I can add the STRICT keyword to make the procedure behave like Oracle:
CREATE OR REPLACE FUNCTION no_data_found_test()
  RETURNS void AS
$BODY$
DECLARE
  v_int_field INTEGER;
BEGIN

  SELECT empty_col
    INTO STRICT v_int_field
    FROM empty_table;

END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
And now when run, it does raise the exception:
postgres=# select * from no_data_found_test();
ERROR:  query returned no rows
CONTEXT:  PL/pgSQL function "no_data_found_test" line 5 at SQL statement
postgres=#
Very Nice! I think I prefer this method. It would be nice if STRICT were a database wide configuration parameter. LewisC

Friday, October 5, 2007

Postgres LiveCD

I read about the Postgres LiveCD at the Blue Gnu. That's such a good idea! I don't know why we don't see more LiveCDs. Even commercial software should be taking advantage of such an easy marketing opportunity. The entry says:

Well, what can I say? The CD is simply Xubuntu with PostgreSQL loaded up and ready to run. And once loaded, you can actually play with PostgreSQL at least three different ways. The Xubuntu CD offers three icons that launch the Command-Line client, PGAdmin3 or phpPGAdmin. Now you can play with the server for as long as you like, goof it up and reboot for a clean, shiny new server. Actually, the CD includes some sample databases and Slony-I, the database replication utility. Now, I didn't notice any documentation lying around, but my guess is one could probably mount the local hard drive and have PostgreSQL create databases on it all day long without ever installing the actual database server. Xubuntu still contains (most of) the other applications normally included - the team has simply loaded PG and it's client applications.
I couldn't find any links about where to get an ISO or anything so I went searching. I found some email list messages from 2006 and two interesting links: As a bonus, Dru links to Joshua Drake's Practical PostgreSQL. This free book is getting a little old but still makes a useful reference. LewisC

Thursday, October 4, 2007

Postgres Has Exceptional Documentation

Yesterday, Postgres made it to Mashable, today it got a mention in E-Commerce Times, Open Source for Business: Now More Than Ever, Part 1. The article isn't about Postgres specifically. It's about business targeted OSS uptake in the enterprise. There's an "open source roundup" "based on the recommendations of business leaders". The Postgres entry:

Postgresql: PostgreSQL is an open source relational database system. It runs on all major operating systems, including Linux, Unix (AIX, BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64) and Windows. It is fully ACID (Atomicity, Consistency, Isolation, Durability) compliant, has full support for foreign keys, joins, views, triggers and stored procedures (in multiple languages). It includes most SQL92 and SQL99 data types, including Integer, Numeric, Boolean, Char, Varchar, Date, Interval and Timestamp. It also supports storage of binary large objects, including pictures, sounds or video. It has native programming interfaces for C/C++, Java Latest News about Java, .Net, Perl, Python, Ruby, Tcl and ODBC, among others, and exceptional documentation.

Wednesday, October 3, 2007

PostgreSQL Site Makes it to Mashable

Mashable posted a SQL TOOLBOX: 20+ Tools and Lessons About SQL. While there are a lot more links to MySQL, at least Postgres made the list. The description isn't exactly the best I have seen though, "PostgreSQL.org - An open source database system developed on SQL." I've been to some of these sites and they aren't exactly the best around. Some are old. There are a couple of really good ones though. The Comparison of different SQL implementations seems pretty well maintained and covers Oracle, Postgres, MySQL, DB2, & SQL Server. Check it out. LewisC

Saturday, September 29, 2007

Hiding SQL in a Stored Procedure

I'm sure you've heard that it is a bad practice to embed SQL in your applications and that all direct SQL access should be encapsulated (or hidden) in stored procedures. I've had people ask me exactly what that means so below is a very simple example of that. I am encapsulating a single table. This is really not that beneficial. Where this gets very powerful is when you are joining many tables or performing complex logic. You can use this method to hide schema complexity from any application that can query data. First I create a simple table and populate it:


create table test_data (
 name text,
 address text,
create_date timestamp );
insert into test_data values (
         'lewis',
        '123 abc st',
        timestamp '2001-01-01 10:00:00');
insert into test_data values (
        'george',
        '456 def dr',
        timestamp '2091-01-01 10:00:00');
postgres=# select * from test_data;
 name  |  address   |     create_date
--------+------------+---------------------
lewis  | 123 abc st | 2001-01-01 10:00:00
george | 456 def dr | 2091-01-01 10:00:00
(2 rows)
Next I create a very simple function. This function accepts two timestamps and returns any records that have a creation data falling within those dates.

CREATE OR REPLACE FUNCTION get_data_by_creation(
     timestamp without time zone, 
     timestamp without time zone)
RETURNS SETOF test_data 
AS
$$ 
  SELECT name, address, create_date
    FROM test_data
    WHERE create_date >= $1
      AND create_date <= $2;
$$
LANGUAGE 'sql' VOLATILE;
And then I select from the function passing in different values as input:

postgres=# select * 
             from get_data_by_creation(
               localtimestamp - interval '10 years', 
               localtimestamp);
 name  |  address   |     create_date
-------+------------+---------------------
 lewis | 123 abc st | 2001-01-01 10:00:00
(1 row)

postgres=# select * 
             from get_data_by_creation(
               localtimestamp, 
               localtimestamp + interval '100 years');
  name  |  address   |     create_date
--------+------------+---------------------
 george | 456 def dr | 2091-01-01 10:00:00
(1 row)
As I said above, this is a very simple example. The usefulness becomes very apparent when you are trying to hide the complexities of your schema from application programmers and users.

Saturday, September 15, 2007

Programming Postgres

This is just a quick overview of the availability of programming for Postgres. With Oracle, your choice in the database is Java or PL/SQL. In general, PL/SQL is the preferred language. One of the things that attracted me to Postgres was the ability to choose one of many languages to program in. I usually choose PL/pgSQL but if I can't do something with that, it's nice to be able to fall back to TCL, Perl or one of the many other languages supported by Postgres. Here is a (incomplete, I'm sure) list of supported languages:

  • PL/pgSQL - Standard Procedural Language for Postgres
  • PL/TCL - TCL, Tool Command Language, easy to learn but powerful
  • PL/Perl - Practical Extraction and Report Language, has grown way beyond its origins
  • PL/Java - Java, meh
  • PL/PHP - PHP: Hypertext Preprocessor, over hyped language
  • PL/Python - Object Oriented, dynamic language. Almost as over hyped as PHP
  • PL/R - Statistical Language, never used it.
  • PL/Ruby - Great language. Over hyped maybe but is my current favorite open source language
  • PL/Scheme - Lispy language, never used it
  • PL/sh - Access to your OS shell language
Adding a new language to your existing database is very easy. You can get the details from the Postgres CREATE LANGUAGE syntax docs. The way you will most often add a language is: CREATE LANGUAGE For example: CREATE LANGUAGE pltcl; The language must already be installed in the OS. You will use the dynamic libraries from the language to create and run your programs. You also need the Postgres component for your language. PL/pgSQL, PL/TCL and PL/Perl come with the standard distribution of Postgres. You can get other languages at http://pgfoundry.org/. Run a search on pgFoundry to see what they have.

Sunday, September 9, 2007

Foxpro to Postgres Data Converter

Do you remember Foxpro? xBase? Clipper? I'm an old clipper head from way back. Anyway, I ran across something in my recent internet travels that caught my fancy. It may have the longest name of any utility I have ever seen. FoxPro PostgreSQL Import, Export & Convert Software will transfer tables to and from PostgreSQL and FoxPro. I haven't downloaded it yet so I can say how well it works but it's a neat idea. I'm guessing it can handle DBF files so that would make it compatible with dBase, Foxpro, Clipper and may other tools. It's a shareware program and I don't see a price. It's been a while since I even ran across shareare. That's a flashback for me too. I use to write and sell shareware. Anyway, here's the blurb for it:

This application will allow you to transfer tables to and from PostgreSQL and FoxPro databases. Importing and exporting can be done without SQL knowledge. This program saves time if you need to transfer tables between the two databases quickly. Connect via DBC file or multiple DBF files in a folder.
Take care, LewisC

Friday, September 7, 2007

Calling a Procedure or Function in Postgres

I think PL/SQL programmers who move to Postgres all run into the same thing, how do I run the procedure or function once I've created it?

Obviously, it's easy when you are calling it from another stored procedure or function.  What most mean, is how do I call it from the command line?  That's easy too.

Using my function and procedure from earlier posts, i.e. 10 Steps to Creating a Function In PostgreSQL Using PLpgSQL and Creating a Procedure In PostgreSQL Using PLpgSQL.

To call either the function or the procedure, you can run:

SELECT func_1();

Or

SELECT * FROM func_1();

You must include the ().

Same for the procedure.

SELECT proc_1();

Or

SELECT * FROM proc_1();

Of course, if you are using a GUI tool like PgAdmin III, it's a somewhat different story.  You need to open the query window but the syntax will be the same.

Technorati Tags: , , , , , , , ,

Wednesday, September 5, 2007

Creating a Procedure In PostgreSQL Using PLpgSQL

I recently wrote about creating a function in PL/pgSQL, 10 Steps to Creating a Function In PostgreSQL Using PLpgSQL.

Today, I am going to show you how to create a procedure.  You don't really create a procedure, you use the same basic syntax as you do for a function.  The RETURNS keyword can get a little tricky as that keyword requirement changes based on what you are trying to do.  To start, I will create a procedure that functions the same as the function in the previous entry.  Then I will make some changes.

CREATE OR REPLACE FUNCTION proc_1(
  OUT out_parameter CHAR VARYING(25) )
AS $$
DECLARE
BEGIN
  SELECT datname
    INTO out_parameter
    FROM pg_database
    LIMIT 1;

  RETURN;
END;
$$ LANGUAGE plpgsql;

If you compare this to the function from the previous entry some things jump out right away.  There is no RETURNS statement.  Postgres is able to determine the return type from the OUT parameter.

The DECLARE keyword is completely optional.

The RETURN keyword has no operand.  I don't need to RETURN a variable because the OUT parameter will be the return value.  As a matter of a fact, if I try to RETURN a value, I will get an error.

Below is another procedure that does basically the same thing but does not return a value.  I am showing you this just to show you how to declare a procedure that truly acts as a procedure (that does not return any values).

CREATE OR REPLACE FUNCTION proc_3()
RETURNS void
AS $$
DECLARE
  local_char_var CHAR(30);
BEGIN
  SELECT datname
    INTO local_char_var
    FROM pg_database
    LIMIT 1;

  RETURN;
END;
$$ LANGUAGE plpgsql;

Notice in this procedure that I am declare the return value as RETURNS void.  If you are a C, Java or C# programmer this should be very familiar.

For a PL/SQL programmer, just consider that a FUNCTION RETURNS VOID is the same as a PROCEDURE in PL/SQL.

That's it for this post.  If you have anything specific that you would like me to cover, please leave a comment or drop me an email.

Thanks,

LewisC

 

Monday, September 3, 2007

Should PostgreSQL be renamed and of so, to what?

For, I guess, about the last week or so, there has been an ongoing debate on the Advocacy email list about renaming the software from PostgreSQL to Postgres or something else.

Some people have very strong opinions about it. Personally, I don't see what the big deal is. The only people who really have an issue are the people in that (hacker) community. CEOs don't care what it's called. No developer I know cares what it's called.

Just to clarify, when I say developer, I mean a person who develops software for business use. Not someone who actually writes code for the database itself (those are the hackers).

So what I mean, is that the guy writing a GL, using PostgreSQL, doesn't care if it's PostgreSQL, Postgres, Postres QL, pg, pgsql or anything else. They just want the software to work.

At the beginning of the discussion on the mail list someone complained about people mis-pronouncing the name, PostgreSQL. Que? How can you mis-pronounce a made up name? It's an imaginary word. If I want to pronounce it "frank" am I wrong? Maybe in your universe but not in mine.

Many people have suggested to just say that Postgres is the preferred spelling (and pronunciation) and PostgreSQL is less preferred but acceptable spelling. I would have to agree with that one. If the pronunciation really bothers someone, let them write a blog about it. C'est la vie.

LewisC


Sunday, September 2, 2007

10 Steps to Creating a Function In PostgreSQL Using PLpgSQL

It's actually fairly easy to create a function using PLpgSQL, especially if you are coming from a database background like Oracle or DB2. Both have procedural languages that look a lot like PLpgSQL. I'll go ahead and show you the code for a very basic function and then I'll explain the steps individually.

CREATE OR REPLACE FUNCTION func_1()
  RETURNS CHAR VARYING(25) 
AS $$
DECLARE
  local_char_var CHAR(30);
BEGIN
  SELECT datname
    INTO local_char_var
    FROM pg_database
    LIMIT 1;

  RETURN local_char_var;
END;
$$ LANGUAGE plpgsql;
Ok. Now we'll go through it line by line:
  1. CREATE OR REPLACE FUNCTION func_1() - This line creates and names the function. The "OR REPLACE" will let us modify the function without fir DROPping it. func_1 can be any valid PLpgSQL name. Even if you are not declaring parameters, you must include the parenthesis ().
  2. RETURNS CHAR VARYING(25) - RETURNS is the keyword that signifies what data type the function will be returing. RETURN is the equivalent in PL/SQL. Notice the S. CHAR VARYING is the equivalent of a VARCHAR2 in PL/SQL. In this case it will be a VARCHAR2(25).
  3. AS $$ - AS is the same as the PL/SQL AS. The $$ is the function code delimiter. You can actually use single quotes here instead of $$ but in that case you would need to double all the quotes in your code. If that doesn't make sense, just consider the $$ mandatory. PLpgSQL is a more basic language than PL/SQL. The code is stored as text and compiled as it is run. The entire function body is just a string and the $$ is the delimiter.
  4. DECLARE - used to begin the variable declaration area. If you are not declaring local variables, this is an optional keyword.
  5. local_char_var CHAR(30); - A locally declared variable.
  6. BEGIN - Begins the body of the the function.
  7. SELECT datname INTO local_char_var FROM pg_database LIMIT 1; - This select statement is selecting the database name from the PG_DATABASE data dictionary table. The database name is being stored in the local variable local_char_var. The query is limiting the result set to 1 row.
  8. RETURN local_char_var; - This line returns the local variable back to the calling program.
  9. END; - Ends the body of the function.
  10. $$ LANGUAGE plpgsql; - The $$ is the end delimiter of the function body (think of it as ending the string). The LANGUAGE plpgsql identifies the language type to the postgres engine so that it knows which language to run the program.
That's it for a very basic function. I'll build on this in the future. LewisC

Friday, August 31, 2007

PostgreSQL 8.3 Features To Be Available After Labor Day

InformationWeek has an article with Bruce Momjian, PostgreSQL 8.3 Features To Be Available After Labor Day: Full text search and other features will become available for free download in beta code after Labor Day; with the final release to follow by 8-10 weeks. Won't be long now. I probably won't have time to play with it while it's in beta but I am looking forward to it. Tsearch2 will be nice and compares to Oracle Text although not quite a feature rich. For some reason, EnterpriseDB's PL/SQL debugger is being included. That doesn't make a lot of sense to me as they are not releasing SPL (their version of PL/SQL) to be included. I guess it's being included because it also supports PL/pgSQL (the postgres procedural language). Bitmap indexes are not making it into this release. That would have been nice for larger installations like data warehouses. Not mentioned in the article but will be there is SQL/X support. I am a huge user of XML and SQL/X makes life so much easier when generating XML. I wonder when XQuery will be supported?

Tuesday, August 28, 2007

Nice Little Cheat Sheet

I stumbled across a nice little cheat sheet by someone names Pete Freitag. It's a PostgreSQL Cheat Sheet. The sheet contains a quick ref for CREATE DATABASE, CREATE TABLE, Adding a primary key, Creating an INDEX, Backing up a database, Running a SQL script, Selecting using a regular expression, Selecting the first N records, Using Prepared Statements, Creating a Function, Vacuum, Reindexing, Showing a query plan, Importing from a file, and some basic SQL statements. It's not a bad quick ref for newbies. The site also has cheat sheets for Coldfusion, Java, Apache, ASCII codes, CSS, XHTML, and even an English Grammer Cheat Sheet. The author also includes a pretty long list of cheat sheets available on other sites. LewisC

Monday, August 27, 2007

What is included in EnterpriseDB Postgres?

I have had the question several times now so I thought I would blog about. This info was actually covered in the original press release, EnterpriseDB Announces First-Ever Professional-Grade PostgreSQL Distribution for Linux. Professional-Grade PostgreSQL Distribution EnterpriseDB Postgres is an open source, pre-configured, certified binary PostgreSQL distribution that simplifies enterprise deployment, eliminating the need to source multiple software components on the Web and manually assemble them. EnterpriseDB Postgres includes a one-click installer that delivers the most commonly needed PostgreSQL components and add-ons for major operating systems, including:

  • The most recent General Availability version of PostgreSQL, plus:
    • Secure Sockets Layer (SSL)
    • Cryptography (pgCrypto)
    • XML Support (libxml)
    • Full Text Search (TSearch2)
    • Database Linking (DBLink)
    • Languages: pl/pgSQL, pl/TCL, pl/Perl
    • Database Connectors: ODBC, JDBC
  • Graphical administration and monitoring (pgAdmin III and phpPgAdmin)
  • Replication (Slony-I)
  • Geospatial information server (PostGIS)
  • Comprehensive documentation
They are also offering support, documentation and forums. LewisC

Friday, August 24, 2007

Create a Partitioned Table In PostgreSQL

First, we create the table structure that will act as the master table.


CREATE TABLE sales_range 
(salesman_id  integer, 
salesman_name text, 
sales_amount  integer, 
sales_date    timestamp);
Then we create each of the partitioned tables using OO style syntax. Each partitioned table "inherits" the structure of the parent table. There is a string relation between parent and child. You cannot drop the parent without dropping the child.

CREATE TABLE sales_jan2000
  ( CHECK (sales_date < TO_DATE('02/01/2000','MM/DD/YYYY')) )
  INHERITS (sales_range);

CREATE TABLE sales_feb2000
  ( CHECK (sales_date < TO_DATE('03/01/2000','MM/DD/YYYY')) )
  INHERITS (sales_range);

CREATE TABLE sales_mar2000
  ( CHECK (sales_date < TO_DATE('04/01/2000','MM/DD/YYYY')) )
  INHERITS (sales_range);

CREATE TABLE sales_apr2000
  ( CHECK (sales_date < TO_DATE('05/01/2000','MM/DD/YYYY')) )
  INHERITS (sales_range);
At this point, any inserts against the parent table (sales_range) would be inserted into sales_range. To actually use the new partitions, we need to write either a PostgreSQL Rule or a trigger. I like triggers personally. I'll create the trigger function. I'll have a separate IF for each type of operation (tg_op) and an IF for each partition within that operation. You can simplify this with dynamic SQL but I will save that for a future posting.

CREATE FUNCTION sales_range_handler() 
  RETURNS TRIGGER AS $$
DECLARE
BEGIN
  IF tg_op = 'INSERT' THEN
    IF new.sales_date < cast('2000-02-01' as timestamp)
    THEN
      INSERT INTO sales_jan2000
         (salesman_id, salesman_name, sales_amount, sales_date)
        VALUES (new.salesman_id, new.salesman_name, 
                new.sales_amount, new.sales_date);
    ELSIF new.sales_date < cast('2000-03-01' as timestamp)
    THEN
      INSERT INTO sales_feb2000
         (salesman_id, salesman_name, sales_amount, sales_date)
        VALUES (new.salesman_id, new.salesman_name, 
                new.sales_amount, new.sales_date);
    ELSIF new.sales_date < cast('2000-04-01' as timestamp)
    THEN
      INSERT INTO sales_mar2000
         (salesman_id, salesman_name, sales_amount, sales_date)
        VALUES (new.salesman_id, new.salesman_name, 
                new.sales_amount, new.sales_date);
    ELSIF new.sales_date < cast('2000-05-01' as timestamp)
    THEN
      INSERT INTO sales_apr2000
         (salesman_id, salesman_name, sales_amount, sales_date)
        VALUES (new.salesman_id, new.salesman_name, 
                new.sales_amount, new.sales_date);
    END IF;
  ELSIF tg_op = 'UPDATE' THEN
    -- Do the same for update
    NULL;
  ELSIF tg_op = 'DELETE' THEN
     -- Do the same for delete
    NULL;
  END IF;
  RETURN NULL;
END;
$$   LANGUAGE plpgsql;
Then I create the actual trigger:
CREATE TRIGGER sales_range_handler_trg 
  BEFORE INSERT OR UPDATE OR DELETE 
  ON sales_range
  FOR EACH ROW 
  EXECUTE PROCEDURE sales_range_handler();
Finally, I do an INSERT:
INSERT INTO sales_range (salesman_id, salesman_name, sales_amount, sales_date)
  VALUES (1, 'Lewis', 10.25, cast('2000-03-15' as timestamp) );
If you select from the main table, you will see one record. That record is actually being selected from the MAR2000 table. You can select directly from that table and see that the record is there. To optimize your partitioned queries, check out Constraint Exclusion. LewisC

Tuesday, August 21, 2007

10 PostgreSQL versus Everything Else Comparisons

Feature comparisons, in addition to being somewhat lame and almost always biased, are very dependent on versions. If you throw in performance considerations, they are also dependent on hardware and configuration. Even so, I like to read comparisons, for historical information if nothing else. Here are some comparisons that I have found. I make no recommendations or even commentary about them. Read through them as if you are a database anthropologist. Dig for the nuggets that increase your existing base of knowledge but remember that a human, probably one with an agenda, put these comparisons together. I have also found that PostgreSQL is most often compared to MySQL and not to one of the large commercial databases such as Oracle or DB2. Update: August 27. A new comparison: PostgreSQL vs Firebird, August 2007 (and hopefully it will remain updated)

  1. PostgreSQL or MySQL - Feb 15 2005, a fairly nice comparison actually.
  2. PostgreSQL vs. MySQL vs. Commercial Databases: It's All About What You Need - DevX April 12, 2004
  3. PostgreSQL vs MySQL: Which is better?, DatabaseJournal Dec 16, 2003
  4. PostgreSQL vs. SQL Server: PostgreSQL is right for the Microsoft stack, SearchEntepriseLinux May 15, 2006
  5. PostgreSQL vs. Oracle: Users speak out, SeachOracle April 6, 2006
  6. PostgreSQL vs MySQL with Rails, June 18, 2005
  7. Oracle 10g vs PostgreSQL 8 vs MySQL 5, This one I wrote, Aug 22, 2005
  8. Firebird vs Postgres, Forum Post, April 23, 2002
  9. PostgreSQL vs. MySQL (Web Techniques, Sep 2001), Dr Dobbs, Jan 1, 2002
  10. MySQL vs. PostgreSQL, Aug 9, 2005
Enjoy, LewisC

Friday, August 17, 2007

There is a New PostgreSQL Blog Aggregator In Town

I used Yahoo Pipes to create a new PostgreSQL blog aggregator. I am hosting it on squidoo. Check out the Squidoo All About PostgreSQL page. You can vote for your favorite PostgreSQL book, leave comments and check out postgres and database related products. BTW, this is the first I have use Yahoo Pipes. It's very cool and very easy to use. It took me about 10 minutes to use it and about 5 minutes to set up my aggregator. If you have ever used a graphical ETL tool, it looks a lot like that. Take care and check out the Squidoo All About PostgreSQL page! LewisC

Wednesday, August 15, 2007

Learn PostgreSQL: Newbie Questions

A poster on the PostgreSQL pgsql-general mail list asked some good questions that might be important to anyone. The general mail is a:

General discussion area for users. Apart from compile, acceptance test, and bug problems, most new users will probably only be interested in this mailing list (unless they want to contribute to development or documentation). All non-bug related questions regarding PostgreSQL's version of SQL, and all installation related questions that do not involve bugs or failed compiles, should be restricted to this area. Please note that many of the developers monitor this area.

The poster says he is a long time MySQL user and was switching to PostgreSQL. He had several reasons but sumed it up as, "...the true features included in 5.1 (as of this moment) are nothing to write home about. The InnoDB stuff is highly advocated but it has its own set of issues,...".

So he was posting some questions to make sure he was heading in the right direction. You can read the actual email here. I am going to sum up some of the emails for you below but if these questions interest you, I would suggest reading the entire thread.

The poster asked four questions but for this entry, I am going to concentrate on his first question. I may add additional entries later to cover his other questions.

His first question regarded performance on DML around a hot table. 10k inserts/day, 500k selects/day and 1M updates/day. He had heard the wrapping the statements in a BEGIN TRANSACTION; and COMMIT; would increase the performance. This generated many great responses.

Here is a partial list of comments and considerations from this question. Many of these I knew but there were a few nuggets that were new to me. Some of these are exact copies of responses and some are slightly paraphrased by me, but the knowledge is actually coming from others.

  • A transaction is a bunch of queries which you want to all get committed or aborted together. The expensive step in Postgres is the COMMIT. Postgres does an fsync which forces the data to go to disk (if you're using good hardware, and don't have fsync=off). That takes from 5-15ms depending on how much data and how fast your drives are.
  • Grouping more work into a single transaction makes the delay for the fsync at COMMIT time less of a problem. Also having more connections (but not too many, more than a few per processor is probably not helping, more than 100 and it's probably slowing you down) also means it's less important since another process can do some of its work while you're waiting for the fsync.
  • You want "fsync=on" on any machine which holds data you care about. And you want hardware which doesn't lie to you so that "fsync is finished" really means the data is on-disk. Else PostgreSQL cannot ensure ACID compliance.
  • psql runs in autocommit mode by default. If you want multiple queries in a transaction you have to issue a BEGIN statement. Drivers may do various things by default.
  • The rule is, if any query within the transaction errors, then all queries within the transaction are rolled back.
  • Build your application around the application needs first, then later look at how to optimize it.
  • Remember the two rules of optimization: 1) Don't 2) (for experts only) Don't yet
  • The only case where you should consider batching together transactions like that is if you're processing a batch data load of some kind. In that case you have a large volume of updates and they're all single-threaded. But usually in that case you want to abort the whole load if you have a problem.

All in all some really good info. If you are picking up Postgres for the first time, I would suggest you sign up for the general mail list and lurk for a while or just head over to the link above and browse.

LewisC

Sunday, August 12, 2007

THE Postgres Resource Center

Where do you go when you have a question about Postgres? Do you join one of the PostgreSQL.org mail lists? That is so 1990s, isn't it? Not that I would recommend signing up for a postgres twitter but how about checking out the EnterpriseDB Postgres Resource Center. You can get news, downloads, documentation and a support forum. The blurb on the site says:

The EnterpriseDB Postgres Resource Center is a community-based site for enterprise application developers and DBAs and provides a rich repository of technical information, tools, and other resources. The site also hosts community-based forums, enabling EnterpriseDB Postgres users to interact with peers and leverage the collective experience of the EnterpriseDB Postgres community.
The forums are still fairly new so there isn't a lot of traffic yet, but that will come with time. There is a forum for news, postgres beginners, postgres on windows, postgres on linux and postgres on mac. Check it out. LewisC

PostgreSQL Books

Wednesday, August 8, 2007

EnterpriseDB Postgres - Enterprise Grade Database

EnterpriseDB announced at LinuxWorld that they are releasing EnterpriseDB Postgres. This is the open source postgresql with a few items added to make it more of an enterprise class database. This version of Postgres is targeted to new and occassional users. EnterpriseDB is hoping for wide spread acceptance of PostgreSQL which in turn may mean additional acceptance for EnterpriseDB. EnterpriseDB Postgres add an easy to use graphical installer, GUI admin tools (web-based), pgAdmin III (a GUI IDE), replication, forums, and downloads for Linux and Windows. One feature that has had a lot of discussion on the Postgres email list is the installation in Linux. Peter Eisentraut blogged about it on his personal blog, EnterpriseDB and the Professional Grade . Advocates are really pushing PostgreSQL into the mainstream and EnterpriseDB is making one of the biggest pushes. EnterpriseDB also offers support and training for PostgreSQL.

Saturday, August 4, 2007

Not Much PostgreSQL News Lately

I've been looking. I've been waiting. It's just not out there. At least nothing I found to be particularly interesting. If you hear any good bits, drop me a comment or email me.

I did want to post a link to my article on my Expert's Guide to Oracle though. Some waste of valuable bandwidth posted a ridiculous entry about Oracle and PHP. Hope fully you find it worthy of a read.

Saturday, July 28, 2007

PostgreSQL vs MySQL

Have you ever wondered why you should choose one open source database over another? What features would make the most sense for your Organization? Maybe you're a developer looking to learn a database and can't choose where to start?

The folks at PostgreSQL have put together a wiki, Why PostgreSQL instead of MySQL. It's, by no means, complete at this time but it is a good start. The wiki is not editable by the public but it is open for reading. The wiki entry compares PostgreSQL 8.1 and MySQL 5.0. Some of the points raised are:
  • Data Integrity - MySQL has improved with a "strict mode"
  • Database Engine Core - No comparison is complete without a bit of FUD: "It is worth observing that the database engine is part of the core of PostgreSQL, whereas InnoDB is a dual-licensed product currently licensed from Oracle Corporation. It's uncertain how Oracle may alter InnoDB in the future as they act in competition with MySQL AB, whereas PostgreSQL has no such conflict of interest."
  • Speed - MySQL is faster but PostgreSQL is narrowing the gap
  • Application Portability - sparse now but hopefully will grow
I hope this is frequently updated by the PostgreSQL community. It will make a great resource. It would be nice if the people at the PostgreSQL.org website would give a few MySQL developers write access to enter counterpoints to the details in the wiki. An alternative would be for MySQL to answer with their own wiki.

Native UUIDs in PostgreSQL

Stefan Arentz posted an entry on his blog, about a hack he's done for PostgreSQL. He's added a native UUID to the database. This is pretty slick. Most of your users will never notice but it can help on databases that need it. He lists some advantages on the google code page:

Database Size - Using a native UUID type on a table that also has an index on the uuid column saves 47% space. This is good because there is less data on disk and in memory. Which really matters if you have potentially hundred of millions of records. The native UUID type uses 16 bytes of storage whereas a ascii uuid column uses 38 bytes.

Consistency - The native UUID type only accepts UUIDs that are well formatted. Of course you can also do this with a constraint check, but having a native type with a fixed format is more elegant in my opinion.

Transparent - Nobody is going to notice the difference. Because of the implicit typecasting you can use UUIDs as if they are ascii values.

Friday, July 27, 2007

Does anyone use PostgreSQL in Production?

The number of Oracle users is huge and Oracle has the numbers available. The number of EnterpriseDB users is smaller but EnterpriseDB tracks those also. I would like to know how many people are using PostgreSQL in production and what kind of systems they are running.

If you monitor the some of the Postgres email lists, such as General, Jobs, and, maybe the Docs group, you can get a basic feel of some usage but only from the more verbal and connected in the community.

Does Forrester Research or Gartner track open source database usage? I haven't seen anything.

I know the community tracks things like downloads but that is incredibly misleading for production usage. Do you know anyone that is surveying or tracking these kinds of stats?

LewisC

Wednesday, July 25, 2007

GlassFish Benchmark Results Keep Climbing

woodjr over at the sun.blogs.com site that they ran a new SPECjAppServer 2004 with GlassFish v2 and got a better response than what Josh posted a while back. Check out the post,GlassFish Benchmark Results Keep Climbing. LewisC Tags:

Monday, July 23, 2007

MicroOLAP Database Designer for PostgreSQL ver.1.2.1released

MicroOLAP Database Designer is an easy CASE tool with intuitive graphical interface allowing you to build a clear and effective database structure visually, see the complete picture (diagram) representing all the tables, references between them, views, stored procedures and other objects. Then you can easily generate a physical database on a server, modify it according to any changes you made to the diagram using fast ALTER statements. You can get it for PostgreSQL and MySQL.

What's new:
[*] CHECK constraints expression Reverse improved a lot
[*] Model tags are treated case-insensitive now
[*] Disallow modification of columns and constraints referenced by Foreign Key
[+] Smart reference counter added to avoid name duplicates
[+] Ability to choose default folder for model files added
[+] SQL Reverse Engineering for CREATE statements added
[+] "Open last edited files at startup" feature added
[+] Interval backup/autosave
[+] "Leave current tool" options added
[+] "Quick hand tool" feature added
[+] Ability to select references or tables connected to selected table (Popup menu item)
[+] Enhanced reference visualization for selected tables added
[-] "Modify Database is trying to ALTER TYPE for timestamps with the same parameters" bug fixed
[-] "Modify Database is trying to DROP DEFAULT on SERIAL columns on PostgreSQL 8.2.x" bug fixed
[-] "Duplicate operation in Table Editor duplicates column and constraint ID" bug fixed
[-] "Table object context menues "SQL Preview" & "Format" open wrong Table Editor tabs" bug fixed
[-] "Error while reverse scalar types: index out of range(4)" bug fixed
[-] "Storage properties were fetched incorrectly" bug fixed
[-] "Impossible to define SETOF return of the SP in Stored Routine Editor" bug fixed
[-] "Copy/Paste is unavailable in rule actions memo of View Editor" bug fixed
[-] Problems with references numeration
[-] Bug with object search by partial string
[-] Scale Combobox is not updated on Actual Size (F5)

Check it out.