Shaving Those Extra Summer SAP HANA Pounds

Is your SAP HANA database still rocking that summer “dad bod,” packed tight after months of unchecked growth – hoarding memory like it just discovered unlimited barbecue at a family party? You’re definitely not alone. With the end of summer snack season and the rush back to reality, it’s probably a good time for your database to shed a few gigabytes, tighten things up, and remember what it’s like to start up quickly for once.

In this two-part blog, I’ll dig into why trimming down your database can feel a lot like trying to lose a few lingering summer pounds. I’ll walk through some ways to ditch those unnecessary table “calories” and cut out rarely-used data, so your SAP HANA can actually move instead of just loafing around when it’s time to reboot. As with any weight loss plan, there’s a bit of discipline involved—and some clever tricks don’t hurt either.

Just like trading junk food for smarter snacks, keeping SAP HANA slim isn’t about getting rid of all the data you love—it’s more about finding ways to access it without loading up your database every time. Why fill up memory with things you rarely touch? Instead, set up your data “buffet” to grab info as you need it and keep your system running better.

So, in this first part, I’ll show you some ways to get what you need from external sources and virtual tables – meaning your SAP HANA stays lean, energetic, and ready to boot up without those sluggish, post-summer vibes. Let’s take a look at a few guilt-free ways to put your database on a smarter, lighter data diet.

SAP HANA Smart Data Access

“Hey Ryan, I need all 12 million rows of that 10-year-old commission data loaded into the new SAP HANA system just in case I ever have to query it.”  Yes – I’ve actually gotten that requirement before.  Luckily it wasn’t my first experience with an SAP HANA database (or this business unit), so I knew what this meant for database bloat.

I say it all the time: businesses love their data no matter how old it is.  It’s why archiving projects are so quick to get shot down, and if your business thinks there may be a delay in accessing data, they’ll most definitely find a new (most likely shadow) IT partner to use.

The simplest way to access data without pulling it into the actual SAP HANA database is to use SAP’s Smart Data Access (SDA) functionality.  This commission data sat in a legacy Oracle database, and a colleague of mine solved the problem by setting up a method to pull individual transactional records into SAP HANA through SDA only when the business actually wanted them.


SDA is super simple to set up.  From HANA Cockpit, you add a remote source from the SAP HANA database to your target database.  There are plenty of source options available, including Oracle, MSSQL, or even another SAP HANA database – which is what I’ll do in this example.

I created a table called SDA_EXAMPLE in my remote SAP HANA database, then searched for it under the remote source to create a virtual table.  From there, I can easily query that table across the Smart Data Access source.
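The Cockpit steps above also have a SQL equivalent. A rough sketch, assuming a remote HANA source named REMOTE_HANA and a schema MYSCHEMA (both names made up for illustration; host, port, and credentials are placeholders):

```sql
-- Create the remote source ("hanaodbc" is the ODBC adapter
-- for a remote SAP HANA system; connection details are examples).
CREATE REMOTE SOURCE "REMOTE_HANA" ADAPTER "hanaodbc"
  CONFIGURATION 'Driver=libodbcHDB.so;ServerNode=remotehost:30015'
  WITH CREDENTIAL TYPE 'PASSWORD'
  USING 'user=SDA_USER;password=<password>';

-- Create a virtual table pointing at the remote table...
CREATE VIRTUAL TABLE "MYSCHEMA"."VT_SDA_EXAMPLE"
  AT "REMOTE_HANA"."<NULL>"."MYSCHEMA"."SDA_EXAMPLE";

-- ...and query it like any local table.
SELECT * FROM "MYSCHEMA"."VT_SDA_EXAMPLE";
```

Only the virtual table’s metadata lives in your HANA system; the 12 million rows stay put in the source until someone actually queries them.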

Smart Data Access is super easy to set up and allows easy access to remote data, saving expensive replication and duplication.  How did that commission problem turn out, you might ask?  After a year, less than 1% of the data had been brought over from the legacy Oracle system.  Big win and a trim database!

SAP Virtual Tables from S3 Object Data

When I ran an SAP ERP in the past, there were many business processes that needed to upload data into SAP through an Excel spreadsheet.  We built whole extensions with UIs to upload data into the database – Excel data, CSVs, you name it – whether it was bulk loads of vendor data, pricing data, or tax data, plus countless other examples.  There were even times when multiple SAP systems needed access to this same data – which led to either a duplicate extension in another system or an expensive integration to replicate that data around.  All of this sounds like a lot of data duplication, and data duplication in an SAP HANA system means memory growth.

But there’s an easier way to handle this – store the data in lower-cost S3 object storage and create a virtual table with SAP’s Smart Data Integration (SDI) using the FileAdapter process.  This allows the data to be maintained in one central storage location and accessed by multiple systems if necessary.  With Pure Storage, S3 Object is currently available on FlashBlade and coming soon to FlashArray.  In this example, I’ll be using Pure Storage FlashBlade to store the data.

The first step is to install an SAP Smart Data Integration Data Provisioning Agent.  I chose a Windows VM for the install, and I’ll explain why in a minute.  Each SAP HANA environment that you want to connect to will need its own DP Agent – but it’s a light install.  You can install this on the same server as your SAP HANA database, but it’s generally recommended not to.

Once installed, launch into the configuration section.  From the /bin directory of the install, run “agentcli.bat --configAgent” (on Linux, “agentcli.sh --configAgent”).

Register your agent, give it a name and then connect it to your SAP HANA system of choice.

You’ll also need to register the FileAdapter, which is what’s used to connect to your S3 bucket.

Back in your Cockpit Database Explorer, your agent and FileAdapter should now be visible and active.
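If you’d rather check the registration from SQL than from the Cockpit view, the SDI system views can confirm it; the view and column names below are from the standard SYS schema (verify against your HANA revision), and the output will show whatever names you registered:

```sql
-- Registered DP Agents and their connection protocol.
SELECT AGENT_NAME, PROTOCOL FROM "SYS"."AGENTS";

-- Adapters registered against those agents (FileAdapter
-- should appear here once registration succeeds).
SELECT ADAPTER_NAME FROM "SYS"."ADAPTERS";
```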

Now that the DP Agent and adapter are set up, we have to set up the S3 bucket where the data will be housed.  Pure Storage FlashBlade™ is the perfect spot for this given its low latencies and extremely high throughput.  Head to the storage section of your FlashBlade and then to the Object Store.  Click the + in the Accounts section to create a new S3 account.

Once created, create a user and give it the necessary access permissions, depending on how your organization has things set up.

Next, you can use an existing access key or create a new one – but make sure you copy it down and keep it somewhere secure, because you’ll need it later when you map your access.

Now go ahead and create your actual bucket, making it multi-site writable.

FlashBlade is so simple to use – even an old database block storage guy like me can manage it.

Back on my Windows box, where the DP Agent is installed, I installed a handy little utility called rclone, which has built-in support for FlashBlade S3.  Once it’s installed and started, configure it to connect to your endpoint using the access keys that you hopefully saved.

Create a folder in your DP Agent directory to mount the S3 bucket to, then start rclone pointing at that folder.
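The rclone side of this boils down to two commands. A sketch, with assumed names throughout – “fb” for the rclone remote, “hana-share” for the FlashBlade bucket, and placeholder endpoint and keys from the S3 account created above (on Windows, rclone mount also requires WinFsp):

```shell
# Define an S3 remote against the FlashBlade data VIP;
# "Other" is rclone's generic S3-compatible provider.
rclone config create fb s3 provider Other endpoint https://flashblade.example.com access_key_id <access-key> secret_access_key <secret-key>

# Mount the bucket onto the folder under the DP Agent directory.
rclone mount fb:hana-share C:\dpagent\s3share --vfs-cache-mode writes
```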

Now you have access to a folder on your Smart Data Integration agent that’s connected to your S3 bucket on FlashBlade.  For this demo I went ahead and created a CSV file of fake data that I want to access in my SAP HANA environment and placed it into the share (and thus into the bucket).
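If you want to follow along, a few lines of Python are enough to generate a fake-data CSV like the one I used; the column names and values here are made up for the demo:

```python
import csv

# Fake employee data for the demo; the header row becomes the
# virtual table's column definitions later on.
rows = [
    ("EMP_ID", "NAME", "DEPARTMENT", "SALARY"),
    (1, "Alice", "Finance", 82000),
    (2, "Bob", "Sales", 67000),
    (3, "Carol", "IT", 91000),
]

# Write the CSV; newline="" avoids blank lines on Windows.
with open("employeeData.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

Dropping the resulting file into the mounted folder lands it in the FlashBlade bucket automatically.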

Before I can use this data as a virtual table, I need to run a quick config process on the CSV file so SAP HANA knows how to import it as a virtual table.  The batch program to run this can be found in your DP Agent’s agentutils directory.
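In a standard SDI install this utility is createfileformat, which generates a .cfg format-definition file alongside the CSV; the paths below are examples matching the mounted folder, and the exact flags may vary by DP Agent version:

```shell
REM Run from the DP Agent's agentutils directory on Windows.
cd C:\dpagent\agentutils

REM -file points at the CSV (or a directory of CSVs);
REM -cfgdir is where the generated .cfg files are written.
createfileformat.bat -file C:\dpagent\s3share\employeeData.csv -cfgdir C:\dpagent\s3share
```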

Once that’s complete, the CSV file should show up in your Database Explorer virtual source as an object that can be created as a virtual table.

Give it a name and the schema you want it created in and you should be good to go.
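Under the hood, the Database Explorer dialog is issuing a CREATE VIRTUAL TABLE against the FileAdapter remote source; a rough SQL equivalent, assuming a remote source named S3_FILES and a schema MYSCHEMA (both illustrative), would be:

```sql
-- File-based sources have no remote database or schema,
-- hence the "<NULL>" placeholders in the remote path.
CREATE VIRTUAL TABLE "MYSCHEMA"."EMPLOYEEDATA"
  AT "S3_FILES"."<NULL>"."<NULL>"."employeeData";

SELECT * FROM "MYSCHEMA"."EMPLOYEEDATA";
```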

The virtual table will add a few columns showing the path it’s being populated from, the name of the file, and the row number of the data (the first row being the column headers).

```sql
select * from employeeData
```

So as you can see, it takes a bit of setup – but nothing too strenuous – to keep large spreadsheets of data out of your actual SAP HANA environment while still taking advantage of them virtually.

In the world of SAP HANA, size really does matter—especially when every extra gigabyte adds to your memory bill and slows down your system’s ability to bounce back after downtimes. By using smarter strategies to keep your database lean, you’re not just tidying up for the sake of organization—you’re making a real dent in memory costs and slashing downtime burdens. Every byte you keep off the “digital waistline” is money saved and seconds shaved off those tense startup windows.

Tune in for Part 2 where I go into the value of SAP HANA data tiering with Native Storage Extension and a very cool feature that will dramatically speed up your start-up times!