I have been writing SQL for a while. Naturally, as a backend or full stack developer, you inevitably run into periods in your career where SQL analysis is a requirement for the job. I love data, so even when I am writing application code you will commonly find me with a database connection open on the command line. Keep reading for my PostgreSQL tips.
If you don’t work full-time in SQL, there are a lot of tips and tricks that database analysts know that could probably be of use to you. If you work full-time in business intelligence or data analysis, some of the tips I will present here will probably be familiar to you. If not, hopefully, you find one or two useful.
Oh, lord. Common table expressions are amazing. If you are an object-oriented developer and you have not heard of a common table expression, this is the tip for you. Common table expressions, or the “with” clause in Postgres, extract functionality into what I would consider the SQL equivalent of a method. Let’s take a look:
with old_users as ( select * from users where id < 100 ) select count(*) from old_users;
In the SQL snippet above, old_users is abstracted and can be referenced in other parts of your query. If you are trying to do database analysis on the fly this can be extremely useful. In particular, sometimes in SQL you will find logic for a data set to be very lengthy, and if you have to reference such a data set multiple times your queries can become very hard to read or to reason about. Abstracting logic into a common table expression keeps your SQL queries legible, keeps logic easy to reuse, and isolates changes to complicated logic in one section of your query instead of spreading it into multiple parts.
Have you ever been refactoring a query on the command line and realized that after every single time, you run the query, you click up in the terminal, and then have to spend a minute scrolling your cursor in the UNIX terminal all the way to the area you want to make a change? And then, you realize you have to do it again after the query runs? Editing SQL queries in the terminal can be really annoying and time-consuming, but so can switching to a different editor to type a SQL query and then pasting it back into the terminal.
Enter the \e, or edit, command. The next time you need to change something in a query you just ran, type \e + enter into the terminal. It will open the command in your system editor. For me, this is vim. When you save changes to the file, it will run them in a terminal shell. This saves so much time when you are refactoring a query or trying to figure out logic that requires you to run a query multiple times. Here’s an overview of how you use \e:
Boom! So much easier than using a terminal to edit queries.
Chances are, if you have worked in a database with a large table, you have run into the fact that the Postgres count command is not very performant. Something that many developers or database analysts don’t really consider is whether they actually need their SQL counts to be 100% accurate. Yes, of course, there are times when you need a correct row count for a table. But, for example, if you are trying to get a rough idea of how many users are in your system, you probably don’t care quite as much. If you don’t need a count that’s 100% accurate, you can select an estimate from the pg_class table, which is extremely fast:
SELECT reltuples AS estimate FROM pg_class where relname = ‘your_table’;
This will beat the speed of a count (*) query every time.
Geographic objects and location-based data is a subject in and of themselves. While it is scary to jump into the world of mapping, I find that the nearest neighbor algorithm is very useful for basic GPS-based applications, especially when doing database analysis. Given a set of GPS coordinates, the nearest neighbor algorithm will provide the closest point to your coordinates within another set of coordinates. So, it’s the closest coordinate to your coordinate essentially. To get the nearest neighbor within Postgres, we use ST_Distance. Here is a query that will give you the 10 closest points to a latitude and longitude:
SELECT name, geom <-> ST_MakePoint( latitude, longitude) AS distance FROM geonames ORDER BY distance LIMIT 10;
Okay, so this doesn’t really fit in with the rest of the content here as it’s not a SQL query. But, if you’re like me, you don’t particularly consider installing a GUI to manage a tool like Postgres. I use Home, Vim, etc. I am a command line dev. This past year I ran into a situation I had not encountered in quite a while where I needed two different applications to run on two different versions of PostgreSQL. I tried to run them side by side. I couldn’t install a second version. So, I tried upgrading my old version of Postgres, but the upgrade failed, and my installation became corrupted. I had to wipe all of my data. One of my colleagues told me I should be using the Postgres app for Mac. I gave it a try and I have to say that managing Postgres installations through the Mac app is a breeze. I can switch between Postgres 11, 12, and 13 easily and don’t have to worry about data directories or failed upgrades.
That’s it for now!
You can find more content like this in our blog section.