Tag Archives: database

Running benchmark: Comparing Ubuntu 20.04 and RHEL 8.5 performance

The claim

I found a claim on Quora that Ubuntu is slow compared to RHEL. I never thought about it. Is it really? It seemed like a sentimental statement with nothing to prove it. I questioned the claim and found out, that many people tried to support the claim still without providing any kind proof.

Instead of continuing to asks any evidence, I decided to dig the evidence my self.

Is there any difference in the performance between the two? I don’t really know, but if you think about default server install with nothing extra, I doubt there could be any significant performance difference.

Ubuntu 20.04 has currently kernel 5.4 where as RHEL 8.5 has 5.13. Libraries and software are pretty much the same. File system by default is XFS on both. I always enable LVM although that could affect performance – certainly not by improving it, but there are other advantages. I don’t think LVM reduces performance much either and as said I always set it up anyway.

In this case I have two virtual machines both having 4GiB RAM and two 3.3GHz CPU cores running on qemu/kvm. The host OS is, yes you guessed right, Ubuntu 20.04 , because on desktop it has certain software I need. And, it works well as virtual host too. That does not affect the results anyhow, since the guest OS has no idea of host OS.

I haven’t tuned either of the guests OS at all except for one thing. I set tuned profile to virtual-guest for both, which makes sense. It is the recommended profile when I run tuned-adm recommend on both of the guests machines.

Put the HammerDB down

Last I ran HammerDB I had to settle with text based version, but this time it had a nice working GUI. But even before quick HammerDB installation, I downloaded Db2 11.5.7 Community Edition. Installed it on both Ubuntu and RHEL. I created SAMPLE database with db2sampl and took timing for that: no difference really. I knew it. Ok that doesn’t prove anything.

But, the real test does. HammerDB.

Ubuntu 20.04

Let’s start with Ubuntu. I read a tutorial on how to run time based benchmark with HammerDB. I want to do this fast. One virtual user only. Looks good.

Ubuntu 20.04

The process goes:

  • Choose Engine and configure it (Db2)
  • Build schema
  • Configure and load driver
  • Configure virtual user
  • Create virtual user(s)
  • Run virtual users
  • Monitor and wait
Here are the results for Ubuntu.

The results for Ubuntu 20.04

System achieved 5178 NPM from 22865 Db2 TPM

RHEL 8.5

Then same thing for RHEL. The machine crashes twice. Reminds me of kernel parameters. We have only 4GiB, so might be I need to tune them. But no, it run all good the third time. In my previous job though, servers with low memory running Db2 on RHEL crashed always without tuning the kernel parameters. There’s a simple formula based on RAM to calculate correct values for Db2 here. That said, I did not change anything from defaults for Ubuntu nor RHEL. It wouldn’t be fair comparison, if I started to tune the kernel parameters for one and not to the other.

Running on RHEL 8.5
There’s finally some I/O wait
The winner is RHEL 8.5 by two New Orders Per Minute (NOPM)

The results for RHEL 8.5

System achieved 5180 NOPM from 22815 Db2 Db2 TPM

First conclusion

There is really no difference between Ubuntu and RHEL what comes to achieved performance results. The two new orders per minute makes 0.04% difference which I’m pretty sure no one can notice just by “using the server a bit”.

Comparison between database engines

Since I already started playing with HammerDB, why not try some more tests. I have earlier installed Db2 on the host machine itself as well as MS SQL Server. I also have virtual machine running Oracle Linux 8 on it with the same 4GiB RAM and two CPU core setup. MySQL and PostgreSQL I have running on the host itself.

The hosts OS, as said, is running Ubuntu Desktop 20.04. It has 4 x 3.3GHz cores and 32GiB RAM and fast NVMe 500GiB M.2 PCIe SSD. This is small form factor machine suitable for industrial use as a headless server running for example Linux. Or you can use it as desktop computer as well. My idea for it was to use it as a platform for several virtual guests, but I wanted to see how it works as a Linux desktop computer as well.

Let’s do few quick tests on the host itself for various database engines. More of a test of HammerDB itself than real comparison between the engines.

Db2

TEST RESULT Ubuntu 20.04 Desktop: System Achieved 6651 NOPM from 2928 TPM.

I’m a bit surprised it didn’t achieve more. Need to test more. It takes time for bufferpools to warm up with automatic memory tuning and with 32GiB memory I’m pretty sure we could get much better results.

MS SQL Server

But let’s check with MS SQL Server I have running on the same machine. Certainly Db2 beat MS SQL Server, right?

MS SQL Server gets higher TPM numbers compared to virtual machines
Obviously the benchmark is somewhat different between the engines,

The winner is… oh no, MS SQL Server

8394 New Orders Per Minute with 19243 SQL Server TPM

Oracle

I have one Oracle 21c Server running on VM running Oracle Linux 8. Oh but Oracle – I’m so lost with it. HammerDB asks too much questions and it seems I need to create another pluggable database. I will do that – later.

PostgeSQL and MySQL

Out of curiosity I ran the test for PostgreSQL and MYSQL:

PostgreSQL: TEST RESULT: 11607 NOPM from 26880 PostgeSQL TPM

MySQL: TEST RESULT: 1739 NOPM from 5252 MySQL TPM

I have no idea why the difference between above two is that significant. Might be for various reasons. I wouldn’t pay much attention on the difference since running the test on host OS and not on virtual machines with proper setup doesn’t make much sense – unlike the more serious comparison I did for Ubuntu Server and RHEL.

Final Conclusion

Without official test for Oracle we cannot make any other conclusion than Oracle is the slowest from these three DB Engines: Db2, MS SQL Server and Oracle. I’m kidding of course; I’m no Oracle expert and just too slow myself to set a proper test for Oracle. That might change once I have enough time to dig deeper on Oracle. For MySQL and PostgreSQL the test was also too quick; more of a test do they work similarly in comparison to Db2 and MS SQL what comes to HammerDB.

What comes to the original claim about Ubuntu being overall slow and which surprisingly many is willing to believe, I think I have busted the claim.

We can speculate how about real server environments and please do, but before you actually have any benchmarks to show otherwise, I take it proven that Ubuntu and RHEL are equally slow or fast.

Also, what comes to MS SQL Server performance compared to Db2, obviously this was not the last word. Let’s try with 10 virtual users beating Db2 for a bit longer.

Db2 with 10 Virtual Users

Final results for Db2 running on this tiny Asus Mini PC PN41 were:

TEST RESULT: System achieved 15650 NOPM from 68735 Db2 TPM.

So we have a winner: Db2 11.5.7?

In a sense Db2 won that it did get the highest number of new orders per minute yes. But in comparing with other database engines I didn’t really organise any meaningful tests between them this time.

Want to test yourself?

Prove me wrong. Run your own tests and provide me your data and conclusions. I have serious doubt Ubuntu Server and RHEL differs much what comes to performance. There certainly is plenty of other things which makes the difference when choosing the distribution. Things like support, cost, platform you are running on and so on. Red Hat certainly has it’s advantages on enterprise level support whereas Ubuntu started strong on desktop, but it is easy to deploy for example on Azure and fully supported.

Five steps to tune Netezza query performance

Netezza is designed with simplicity in mind. You can get it up and running in hours rather than weeks. When you follow the basic rules, 99 percent of your applications and queries can perform well. There are six things you should keep in mind while designing your databases and setting up maintenance tasks:

  • Distribution
  • Data types
  • Statistics
  • Zone maps
  • Data organization
  • Groom

When you have taken care of the these things, you shouldn’t have any major issues with performance. However, if you do have issues, how do you find them? Follow these five steps, which can help you to find and possibly fix the performance issues on your appliance.

1. Use IBM Netezza Performance Portal and the query history database

The IBM Netezza Performance Portal is an excellent tool for making sure you have everything in place, and if you don’t it will help you to identify any issues. It provides an excellent front end to the query history database and it is able to connect the appliance performance history with your query history.

Netezza Performance Portal and its installation guide are available for download from IBM Fix Central. The installation is fairly simple. Please refer to the “IBM Netezza Performance Portal User’s Guide” (which is included in the download) for installing and configuring both Netezza Performance Portal and the query history database.

2. Check the server load and resources

One of the greatest things about Netezza Performance Portal is its ability to monitor one or more Netezza systems and their resource usage. From the nice graphical user interface (GUI) you can easily identify performance peaks. Then, if you see any changes in trend, you can drill down and take a closer look by using your mouse pointer to click where the change begins and ends. You can repeat this as many times as you want. After zooming in on the period of time you are interested in, you can click the “Jump to History” button to see which queries were running during that time slot. But first you will need to choose the host you are interested in, as this will populate the “Submit Time” and “Finish Time” fields in the query history view.

3. Identify the problematic queries

Identifying problematic queries is, of course, easier said than done. Basically, there are three different types of long-lasting queries that usually require a closer look:

  • Queries that take a long time to finish because they need to access a lot of data.
  • Queries that take long time to finish because the query is not optimal.
  • Queries that take a long time to finish because the database design is not optimal.

When you first look at your queries, they probably just look like long-lasting queries. However, if you know your data and your queries, you might be able to place at least some of them in the first class I mentioned (queries that naturally take a long time to finish because of large amounts of data). What I usually do myself, after I have zoomed in on a performance peak or otherwise interesting period from the Netezza Performance Portal monitoring view, is sort the queries based on their “Query Duration.” I simply list the queries in descending order and then look at the “Query Text.”

When you have selected the interesting query, you can perform various actions with Netezza Performance Portal, including the following:

  • You can right-click the query and check the “Identical Query Trend Chart.” This will give you an idea of the variation in duration for identical queries over time. For instance, if the system is overwhelmed by concurrently running workloads, it is obvious that a query will not run as fast as it normally would. If you notice that a query took longer than normal to run, you should check what else was running on the system at that time.
  • If it really took longer to run than is typical, you can check the “Query/Plan Activity Chart.” This will give you a nice graphical view of all the queries running concurrently on the system, which could be affecting the duration of the query you were interested in.
  • You can check if the statistics are up to date on tables related to the query, and you can even update the statistics thorough Netezza Performance Portal if they are outdated.You can also check encumbrance. There might have been, for instance, loads or aborted ad hoc queries running on the system that negatively affected the system performance. I have once identified the latter to be the case for why highly-prioritized extract, transform and load (ETL) tasks did not finish in time. Since the queries were aborted, they were not seen in the Query/Plan Activity Chart, but rather on the encumbrance view.

Picture 1: Identical Query Trend Chart

4. Check the query plan

You can check the query plan directly from Netezza Performance Portal and that’s not a bad choice. However, what I usually do is check only the plan ID. You can find this from one of the columns when you are listing the queries on the query history view. I then take the plan ID, log in to the appliance and use nz_plan to take a closer look at the query. One advantage of nz_plan is that it lists the interesting snippets early in the file. Another attractive feature is that it rewrites the query very nicely in a readable format.

That said, you can still use the other techniques available to produce the query plan, including the one available directly through Netezza Performance Portal.

Picture 2: Query Plan generated with nz_plan

 

5. Check the distribution keys and change them if needed

Now that you have the query plan, one of the first things you can check from there are the distribution keys for the tables the query is accessing. Are they what you assumed they were? Are there re-distributions—single or double? If you see something like “1[03]:spu DownloadTableNode distribute into link 2147484337” you know that the table is distributed. If you assumed it isn’t, then you should check the distribution keys again.

Check how it works now when there are no issues

Don’t wait until you have issues. Familiarize yourself on how to monitor performance issues before you have performance issues. If you don’t already have Netezza Performance Portal, install and deploy it. Try and test how nz_plan utility works. Read the query plans. By doing this, you will be ahead of the game and ready to tackle any upcoming issues.

If you have any questions or suggestions related to query performance tuning, please leave a comment. You can also follow me on Twitter @TVaattanen to discuss more about Netezza.