How big is too big for a PostgreSQL table?
I'm working on the design for a RoR project for my company, and our development team has already run into a bit of a debate about the design, specifically the database.
We have a model called Message that needs to be persisted. It's a very, very small model with only three db columns other than the id, however there will likely be A LOT of these models when we go to production. We're looking at as much as 1,000,000 insertions per day. The models will only ever be searched by two foreign keys on them which can be indexed. As well, the models never have to be deleted, but we also don't have to keep them once they're about three months old.
So, what we're wondering is if implementing this table in Postgres will present a significant performance issue? Does anyone have experience with very large SQL databases to tell us whether or not this will be a problem? And if so, what alternative should we go with?
Rows per a table won't be an issue on it's own.
So roughly speaking 1 million rows a day for 90 days is 90 million rows. I see no reason Postgres can't deal with that, without knowing all the details of what you are doing.
Depending on your data distribution you can use a mixture of indexes, filtered indexes, and table partitioning of some kind to speed thing up once you see what performance issues you may or may not have. Your problem will be the same on any other RDMS that I know of. If you only need 3 months worth of data design in a process to prune off the data you don't need any more. That way you will have a consistent volume of data on the table. Your lucky you know how much data will exist, test it for your volume and see what you get. Testing one table with 90 million rows may be as easy as:
select x,1 as c2,2 as c3 from generate_series(1,90000000) x;
Limit Value Maximum Database Size Unlimited Maximum Table Size 32 TB Maximum Row Size 1.6 TB Maximum Field Size 1 GB Maximum Rows per Table Unlimited Maximum Columns per Table 250 - 1600 depending on column types Maximum Indexes per Table Unlimited
Another way to speed up your queries significantly on a table with > 100 million rows is in the off hours cluster the table on the index that is most often used in your queries. We have a table with > 218 million rows and have found 30X improvements.
Also, for a very large table, it's a good idea to create an index on your foreign keys.