Wednesday, March 22, 2017
Testing boundaries - thoughts before start
10:46 AM
As discussed in our previous post about what truly lies behind the business need for an AFA box versus the vendor hype we face, let's assume we have just gathered our technical requirements and are now facing the task of stress-testing one of these boxes.
For any performance test there are several conditions that must always be present and should be considered. These are my initial thoughts; I hope you find them useful.
- Create "significant" traffic: this means not only stressing performance but using traffic patterns that are representative for you (i.e. workloads similar to the environment in which you are going to put the device under test).
- Don't forget to measure: an important and also tedious part of testing is taking notes and writing down the results, so always plan your performance scenarios with data export in mind, so you can easily record or graph the results.
- Test boundaries: if a device claims to reach X performance, test it. Let's say X is 100K IOPS @8k block size, 70/30 (read/write); you have to find a way to reach that performance in your infrastructure (i.e. generate those workloads). There are several considerations here, e.g. running that workload on one thread is not the same as running it on multiple threads, which is a much more realistic approach; we will dig into this in a dedicated post about the AFA testing procedure.
- Tune the environment: this can also be called environment set-up. Make sure your underlying infrastructure is configured with best practices and free of issues, so that the test you're running is not affected by any factor other than the test procedure itself.
- Automate as much as you can: testing can be tough; imagine having to re-test after changing a few parameters, applying new versions, etc. Nearly impossible by hand. So take an automated approach to set up your tests, fire them off, and even plot the results in a fancy way.
- Understand what you're doing: testing is not about running a workload and seeing whether performance is good or not, or at least it shouldn't be. The whole purpose of a testing procedure is to understand how the device under test reacts under stress conditions as well as under normal, close-to-real ones, and also to notice how it works internally and how it behaves under changes... which leads me to the next bullet.
- Resiliency: so you have tested, everything seems perfect, performance is outstanding and testing is going well... but have you tested how the box behaves under unexpected and planned changes? Resiliency is key for production environments, since it not only gives you an overview of how high availability is handled (which matters most in production) but also of how the system reacts to these changes (you can easily be surprised by well-known vendors going into panic mode after switchovers).
- Plan the tests accordingly: if you're running a PoC or a performance test, a lot of work goes into preparing and setting up the environment; this can involve changes to the physical network to test HA, clusters, or other functionality. You will lose a lot of time if you keep switching back and forth, so it is really important to order the test plan so that you make the minimal changes necessary and, between changes, run the maximum number of tasks before the next change. This will save you a lot of time.
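To make the "measure" and "automate" bullets concrete, here is a minimal Python sketch of a parameter sweep that exports results as CSV for later graphing. Everything here is illustrative: `run_workload` is a hypothetical placeholder for your real benchmark invocation (VDBench, or whatever you use), and the workload matrix values are just examples.

```python
import csv
import itertools

# Illustrative workload matrix: block sizes, read percentages and
# thread counts we want to sweep, per the "test boundaries" bullet.
BLOCK_SIZES = ["4k", "8k", "32k"]
RW_RATIOS = [(70, 30), (100, 0)]
THREADS = [1, 8, 32]

def run_workload(bs, read_pct, threads):
    """Hypothetical placeholder for the real benchmark invocation.
    It returns dummy numbers so the sketch is runnable as-is."""
    return {"iops": 0, "lat_ms": 0.0}

def sweep(path="results.csv"):
    # Run the whole matrix and export one CSV row per combination,
    # so results can be recorded or graphed without manual note-taking.
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["bs", "read_pct", "threads", "iops", "lat_ms"])
        for bs, (r, _), t in itertools.product(BLOCK_SIZES, RW_RATIOS, THREADS):
            res = run_workload(bs, r, t)
            w.writerow([bs, r, t, res["iops"], res["lat_ms"]])
```

The point is not the specific tool but the shape: every run lands in a file automatically, so re-testing with new parameters or a new firmware version is just another sweep.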
Sunday, March 19, 2017
All Flash Array: vendor hype vs business needs
11:10 AM
Last month I was involved in a project to test the performance of several All Flash Array boxes, let's call them AFAs, since we expect to start delivering a new class of services (and SLAs?) to customers.
Being in an R&D team gives you the opportunity to see the whole story: you start with the sales pitch, then the not-so-technical pre-sales, and at the end you break a box in a PoC and finish with real engineers who explain the details of their architecture and why they don't support what you just tested (but it was in the sales pitch, right?).
What I do want to highlight here is the relation between vendor hype and business needs. An AFA can bring you outstanding performance in IOPS, latency, compression, and so on... but the point is: what do you really need? In any architecture design you are supposed to deliver a solution that meets technical requirements plus business requirements/needs. With AFAs this can be quite tricky, since if you don't have a clear understanding of your business needs you're going to be pushed into an unfair or imprecise comparison.
So, what about technical requirements?
I've faced two kinds of scenarios for AFA deployments. The easier one is deploying new infrastructure aimed at fulfilling specific requirements for an application suite; this is always the best-case scenario for an architect, since you can gather technical requirements easily, including not only current requirements but also a forecast of upcoming demand. As you may know, it's not all pink elephants, and the other common scenario is moving a current deployment to an AFA solution. In the latter case you have to take a huge number of considerations into account in order to plan accordingly; I can summarize the following:
- Amount of IOPS: this is something you see quoted a lot, and on its own it is completely wrong. What is 1K IOPS? At which block size? At which read/write ratio?
- Amount of IOPS based on a given IO distribution, which includes the block size and read/write ratio of each component. Getting these numbers can be significantly hard; most current arrays keep information about all the workloads they have run, which is great for getting the IO distribution per block size plus read/write ratios, but it assumes you are running similar workloads all day long (i.e. a consistent distribution of a given set like 30K IOPS @4k 65/35, 15K IOPS @8k 60/40, and so on... but what about the nightly jobs? What happens to your performance during backup jobs?)
- Expected and maximum latency: Based on application + OS/Guest OS needs
- Expected compression ratio (for your data set!)
- HA and expected performance in contingency
- Network based (NFS), IP based (iSCSI) or FC?
- Disk replacement policy and MTBF/MTTF
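To make the IO-distribution requirement concrete, here is a small Python sketch that aggregates per-bucket figures into the totals a vendor quote should be checked against. The distribution values are the illustrative ones from the bullets above, not real measurements.

```python
# Hypothetical IO distribution, e.g. pulled from an existing array's
# workload stats: (block_size_bytes, iops, read_pct) per bucket.
DISTRIBUTION = [
    (4 * 1024, 30_000, 65),   # 30K IOPS @4k, 65/35 read/write
    (8 * 1024, 15_000, 60),   # 15K IOPS @8k, 60/40 read/write
]

def totals(dist):
    """Aggregate the buckets into total IOPS and total throughput in
    MiB/s: a bare IOPS number without this context is meaningless."""
    total_iops = sum(iops for _, iops, _ in dist)
    total_mibps = sum(bs * iops for bs, iops, _ in dist) / (1024 * 1024)
    return total_iops, total_mibps
```

Run against a real distribution (including the nightly-job and backup-window buckets), this is the baseline the device under test has to beat.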
Ok, but where is the vendor Hype?
Well, the quick answer is anywhere you get a sales/pre-sales engineer talking, but to stay on topic: I've found solutions claiming 100K IOPS @32K block size in a single box... the easy math here is to ask them how many IOPS they support at 4/8K if they reach that at 32K, and also how much bandwidth they expect.
So, how can I test performance and avoid being seduced by pink elephants?
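The "easy math" is just IOPS times block size. A tiny Python helper makes the sanity check explicit; the 100K @32K figure below is the illustrative claim from above, not a measurement.

```python
def implied_bandwidth_gbps(iops, block_bytes):
    """Bandwidth (decimal GB/s) a box must sustain to actually
    deliver `iops` at `block_bytes` per IO."""
    return iops * block_bytes / 1e9

# The illustrative vendor claim: 100K IOPS @32K block size.
claim = implied_bandwidth_gbps(100_000, 32 * 1024)  # about 3.28 GB/s
```

If the quoted front-end ports or controllers can't move that much data, the headline IOPS number is marketing, not engineering.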
When we began the testing process we aimed at a huge number of VMs, each with an RDM to the array, but we later figured out that there is no need for that: with a few VMs holding many disks each, you can easily set up a quick test. For the testing procedure and considerations, IDC has written a good recap in their "All-Flash Array Performance Testing Framework", and there is also a tool made by EMC engineers to test arrays, the "AFA PoC Toolkit", which sets up a few VMs on a VMware host, connects the host to the array's LUNs, maps them as RDMs, and sets up VDBench on the VMs.
I do recommend this approach, changing the parameters in the VDB files to meet your requirements. One caveat with the toolkit is that it only runs against EMC XtremIO; I've made several changes to be able to run it against PureStorage boxes, and I'm working on the changes needed to test against SolidFire too. In a later post we will discuss design considerations for an AFA platform, plus the testing tools and results for each vendor (we're testing PureStorage, EMC XtremIO, NetApp SolidFire).
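For orientation, a minimal VDBench-style parameter file for an 8k 70/30 random workload could look roughly like this. This is a hand-written sketch, not taken from the AFA PoC Toolkit; the LUN path and thread counts are placeholders you would adapt to your own environment.

```
* Storage definition: one raw device per RDM disk on the VM
sd=sd1,lun=/dev/sdb,openflags=o_direct

* Workload definition: 8k transfers, 70% reads, fully random
wd=wd1,sd=sd*,xfersize=8k,rdpct=70,seekpct=100

* Run definition: uncapped rate, 10 minutes, 5-second reporting,
* repeated for several thread counts to test boundaries
rd=rd1,wd=wd1,iorate=max,elapsed=600,interval=5,forthreads=(1,8,32)
```

Sweeping thread counts in one run definition is exactly the "one thread vs. many threads" distinction from the previous post's boundary-testing bullet.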