“HammerDB is the leading benchmarking and load testing software for the world’s most popular databases supporting Oracle Database, SQL Server, IBM Db2, MySQL and PostgreSQL.” Source: https://hammerdb.com/index.html
There is a wealth of documentation available online from the creators of HammerDB, and their blog goes into great detail about what HammerDB is and how it works. I’ll try not to repeat all of that here; instead I’ll lay out the basics you need to get started. In subsequent posts I will talk through the scripts I have created to automate deploying this in your environment; I aim to make it quick and simple. Hopefully, once you’ve been through these posts, you’ll have a good idea of what HammerDB is and how it works, and you’ll have the tools and knowledge to set it up in your environment and run your own tests.
Essentially, HammerDB is a free, open-source application that runs on Windows (among other operating systems) and produces two principal benchmark statistics: NOPM (New Orders per Minute) and TPM (Transactions per Minute). Both are derived from a workload modelled on the industry-standard TPC-C benchmark. HammerDB can be used to benchmark Microsoft SQL Server as well as a range of other database platforms.
A bit more detail
TPC-C is a benchmark created to simulate a mix of five concurrent transactions of different types and complexity, designed to represent any industry that must manage, sell, or distribute a product or service. The benchmark is created and maintained by a council of industry-recognised organisations, including the likes of Cisco, Dell and Oracle. You can see the full list here: http://tpc.org/information/who/whoweare5.asp
The HammerDB benchmark isn’t the same as TPC-C and cannot be compared to it directly, as it only implements a subset of the TPC-C specification. HammerDB takes the core principles of TPC-C but packages them up in a way that allows anyone to run tests. I must stress this, as it is mentioned multiple times on HammerDB’s website: you cannot compare your results to the officially published TPC-C benchmarks. In other words, you can’t use it to say to your infrastructure provider, “hey, you must be doing something wrong because our SQL Server results are X times lower than the industry standard”, but you can use it for your own comparisons, to say things like “SQL Server is estimated to be three times quicker using disk type XYZ or config XYZ”, or “when we do XYZ, the performance of SQL Server looks to increase by two to three times”.

In fact, according to the documentation, the HammerDB implementation can be more CPU- and memory-intensive than the official TPC-C implementation because it removes what is called keying and thinking time. My understanding is that keying and thinking time is just as it sounds: the time users would normally take to key in their inputs, and to think about doing so, in the real world. For me the thinking time would probably be longer than average, as I’m not always the sharpest guy on the block, so it’s probably a good thing it’s not used. That said, you can enable keying and thinking time via a parameter if you so choose.
I can see lots of use cases where this could be useful: evaluating the performance gains from increasing the compute resources of your Azure SQL VMs, comparing how a selected cloud-based provider stacks up against your current on-premises setup, or testing the impact of configuration changes, such as getting a rough idea of how much of an impact Transparent Data Encryption could have in your environment, or how much of a difference installing the latest SSD or flash-based disks could make. There are probably many other use cases I’ve not even considered; hopefully by now you can see how this could be something useful for you.
About the workload
The HammerDB workload simulates the computer system of a fictitious company that takes orders from customers and supplies products from its warehouses.
• The company sells 100,000 items and keeps its stock in warehouses
• Each warehouse has 10 sales districts
• Each district serves 3,000 customers
• Customers call the company, whose operators take the orders
• Each order contains a number of items
• Orders are usually satisfied from the local warehouse; however, sometimes some items aren’t in stock and need to be supplied by another warehouse
• You can increase/decrease the number of warehouses and virtual users to simulate larger or smaller businesses and workloads
• The number of rows in the Item table is fixed; the rest change depending on how many warehouses you choose to configure
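To give a feel for how the schema scales, here is a rough Python sketch of the scaling rules listed above. The fixed item count, districts per warehouse, and customers per district all come from the bullets; the assumption that there is one stock row per item per warehouse follows the TPC-C schema convention and is my own addition, so treat the stock figure as an estimate.

```python
# Approximate initial row counts for the HammerDB tpcc schema, applying
# the scaling rules listed above. The item table is fixed at 100,000
# rows; the other tables grow with the number of warehouses configured.
def tpcc_row_counts(warehouses: int) -> dict:
    districts = warehouses * 10        # 10 sales districts per warehouse
    customers = districts * 3000       # 3,000 customers per district
    return {
        "item": 100_000,               # fixed, regardless of warehouse count
        "warehouse": warehouses,
        "district": districts,
        "customer": customers,
        # Assumption: one stock row per item per warehouse (TPC-C style)
        "stock": warehouses * 100_000,
    }

# Compare a small and a large configuration
for ware in (1, 10, 100):
    print(ware, tpcc_row_counts(ware))
```

Note that these are row counts at schema build time; running the workload inserts new orders, so several tables grow during a test.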
It is worth mentioning that the tool also supports benchmarking OLAP (Online Analytical Processing) systems; however, this is out of scope for these articles, which focus on an OLTP (Online Transactional Processing) workload. Perhaps that would be something to look at in the future.
By default, the utility creates a database called tpcc. The database schema, generated from SQL Diagrams, is shown below.
Note: the warehouse table has no relation to a data warehouse (OLAP) database; in this schema it’s just the name of a relational table.
NOPM VS TPM
As described in the intro, since HammerDB doesn’t implement the full TPC-C specification, it instead has its own two metrics: NOPM and TPM. Again, these aren’t the same as the official TPC-C results and so cannot be compared to them directly. NOPM (New Orders per Minute) should be used as the primary metric, since it has been designed in a way that allows you to compare the performance of different platforms, e.g. Oracle vs Microsoft SQL Server vs MySQL. This could be really useful if you have a colleague who’s obsessed with MySQL and you want to flex the muscles of SQL Server to show them why it costs X times as much.
The TPM (Transactions per Minute) counter is useful for scenarios where you want to compare against other platform-specific performance data, such as perfmon results. In fact, the official HammerDB blog goes as far as to say that TPM is essentially the same as the SQL Server Batch Requests/sec perfmon counter, just recorded per minute rather than per second. This is great, as you can compare it against other perfmon stats like disk IO, page life expectancy, etc.
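Since the only difference described is the time unit, lining TPM up against the per-second perfmon counter is simple arithmetic. A quick illustrative sketch (the 600,000 TPM figure is an invented example, not a measured result):

```python
# TPM is recorded per minute, while SQL Server's "Batch Requests/sec"
# perfmon counter is per second; dividing by 60 puts them on the same scale.
def tpm_to_batch_requests_per_sec(tpm: float) -> float:
    return tpm / 60.0

# e.g. a run reporting 600,000 TPM corresponds to roughly
# 10,000 batch requests per second in perfmon.
print(tpm_to_batch_requests_per_sec(600_000))  # 10000.0
```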
There’s a lot of detail about the workload on the official HammerDB site that I’ve glossed over so we can get on with the actual setup, but if you’re seriously considering making changes to your production environment based on the output of HammerDB results, I’d strongly recommend you research the official documentation in detail before making any costly decisions.
OK, now that the theory’s out of the way, let’s get on with the setup in
Hammer Part 2 -Let the Nail see the Hammer!