Sunday 30 April 2017

FDMEE – diving into the Essbase and Planning security mystery – Update

I wanted to provide an update to the last posts I wrote on the mystery around security with on-premise FDMEE and Data Management in the cloud. In those posts I went through how data is loaded when the load method is set to “All data types with security” in Data Management. I was interested in understanding the limitations and performance of this method, so I thought it was worth putting a post together on the subject.

If you have not read the previous posts I recommend you do so, as I will be assuming you understand the background to this topic. The posts are available at:

FDMEE – diving into the Essbase and Planning security mystery – Part 1
FDMEE – diving into the Essbase and Planning security mystery – Part 2

Just to recap: if you select the load method “All data types with security” as a non-administrator and are loading to a target BSO cube, then the REST API comes into play to load the data. This is currently only available in EPM Cloud; on-premise FDMEE will still load using the Outline Load utility as an administrator.

On a side note, it is now possible to set the load method in the data load rule instead of just at the target application level.


You will also notice that another new load method has been snuck in, “All data types in Planning File Format”. I didn’t see this mentioned in the announcements and new features document, but I am going to cover it in a future post.

At the moment there is a slight bug with setting the load methods in the data rule.

When you create a rule the default option is “Numeric Data Only”; say you change this to “All data types with security”.


Now if you want to change it back to “Numeric Data Only” it is not possible because it is not in the list of values.


If you manually type “Numeric Data Only” and try and save the rule then you will be hit with an error message.


I am sure it will be fixed at some point in the future, probably without anyone being informed that it has been.

Anyway, back to the main topic of this post: if you load data as a non-administrator with the “All data types with security” method, the REST resource that is called is “importdataslice”.

The documentation provides the following details on the REST resource:

“Can be used to import data given a JSON data grid with a point of view, columns, and one or more data rows. Data will be imported only for cells that the user has read-write access to. Imports data of types Text, Date and Smart List along with numeric data. Returns JSON with details on the number of cells that were accepted, the number of cells that were rejected, and the first 100 cells that were rejected.”

The URL format for the REST resource is:

https://<cloudinstance>/HyperionPlanning/rest/v3/applications/<app>/plantypes/<cube>/importdataslice

As explained in the previous post the user’s security will be honoured with this method of loading data.

When exporting data from Data Management to the target planning application cube, a file is created containing JSON with a grid based on the data being loaded; this file is then read and forms the body of the POST to the “importdataslice” REST resource.

Say I have a cube which has seven dimensions and I am loading the following data set.


Data Management will create the following JSON which contains a grid based on the above data.


The scenario member will be in the column and the rows will contain members for the remaining dimensions.
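As an illustration, the JSON that forms the body of the POST follows the documented “importdataslice” grid format and looks something along these lines; the member names are purely illustrative and the POV is empty because everything apart from the scenario sits in the rows:

{
  "dataGrid": {
    "pov": [],
    "columns": [["Plan"]],
    "rows": [
      {"headers": ["FY17", "Jan", "Working", "110", "P_1000", "4110"], "data": ["500"]},
      {"headers": ["FY17", "Jan", "Working", "110", "P_1000", "4120"], "data": ["1000"]}
    ]
  }
}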

In terms of planning this will be the equivalent of the following form:


So what about restrictions and performance? Well, this is the cloud so there are restrictions, and if you have ever built a large form you will know there are performance implications.

In the cloud, if you try to build a form which has more than 500,000 cells, you will receive an error message when you open the form.


The threshold seems to be the total number of cells in the data grid and not just data cells; as the REST resource is built on top of a data grid, the same threshold should also apply to Data Management.

Now 500,000 cells may sound like a lot but with Data Management you will be used to loading data in rows and not thinking about total cell counts.

The total possible number of rows of data will also depend on how many dimensions are in the cube you are loading data to.

I calculated that with the cube I am loading data to it should be theoretically possible to load 71,426 rows before hitting the threshold limit.
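As a rough check, and assuming the governor counts every cell in the grid including the row header cells, 71,426 rows multiplied by 7 cells per row (six row headers plus one data cell) comes to 499,982 cells, which is just under the 500,000 limit; one more row, plus the grid’s header cells, tips it over the threshold.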

Just to make sure I added an extra row of data to see what the output from Data Management would be.


The export process failed and checking the log confirmed the reason behind it.

11:38:02,593 DEBUG [AIF]: Overrode info.loadMethod for the non-admin user: REST
11:38:02,688 DEBUG [AIF]: EssbaseService.performLoadRest - START

11:38:02,728 DEBUG [AIF]: requestUrl: http://localhost:9000/HyperionPlanning/rest/v3/applications/REF/plantypes/Plan1/importdataslice
11:56:41,651 ERROR [AIF]: The rest service request has failed: 400 Bad Request - {"detail":"Unable to load the data entry form as the number of data entry cells exceeded the threshold.\n\nCriteria: Number of cells\nError Threshold: 500000\nWarning Threshold: 250000\nCurrent Value: 500003","status":400,"message":"com.hyperion.planning.governor.HspGovernorThresholdException: Unable to load the data entry form as the number of data entry cells exceeded the threshold.\n\nCriteria: Number of cells\nError Threshold: 500000\nWarning Threshold: 250000\nCurrent Value: 500003","localizedMessage":"com.hyperion.planning.governor.HspGovernorThresholdException: Unable to load the data entry form as the number of data entry cells exceeded the threshold.\n\nCriteria: Number of cells\nError Threshold: 500000\nWarning Threshold: 250000\nCurrent Value: 500003"}

11:56:41,655 INFO  [AIF]: EssbaseService.loadData - END (false)

As expected, the threshold limit error appears in the JSON response from the REST resource. What is concerning is that the error was not generated straight away: if you open a planning form the threshold error message appears immediately, but with the REST resource it took 18 minutes before erroring out. This also gives you an idea of the performance implications of using the REST resource; 18 minutes to fail on 71,427 rows of data is certainly not performant. Just to be clear, no data is loaded if the threshold limit is hit.

Now that I know the limit is the same as with data forms I can reduce the number of records.


This time the number of records should equate to just less than the 500,000 cell limit.


Failed again, time to look in the process logs.

12:40:25,072 DEBUG [AIF]: EssbaseService.performLoadRest - START
12:40:25,103 DEBUG [AIF]: requestUrl: http://localhost:9000/HyperionPlanning/rest/v3/applications/REF/plantypes/Plan1/importdataslice

12:58:48,362 ERROR [AIF]: The rest service request has failed: 400 Bad Request - {"detail":"The form RestDataGrid_1493383281849 is very large and could take considerable time to load. Do you wish to continue?","status":400,"message":"com.hyperion.planning.FormWarningException: The form RestDataGrid_1493383281849 is very large and could take considerable time to load. Do you wish to continue?","localizedMessage":"com.hyperion.planning.FormWarningException: The form RestDataGrid_1493383281849 is very large and could take considerable time to load. Do you wish to continue?"}

This time the error seems a little ridiculous because it looks like a generic warning related to opening large forms and has little relevance to the data load I am trying to perform. If I open a data form in planning which is just below the threshold, the same message is not generated.

I tried reducing the number of rows by 10,000 and then another 10,000, but still the same error was being generated. Then I remembered seeing the message when opening large forms in planning versions prior to 11.1.2.2.

I went back and tried opening a large form in 11.1.2.1, which confirmed it is the same message.


So it looks like the REST resource is using the same data grid functionality that existed in older versions of planning.

In the versions of planning where this message was generated it was possible to set a threshold in the display options of the application.


If you look at the display options in the cloud or on-premise versions from 11.1.2.2 then the setting is not available.


I had a look around for the setting in the UI but unless I have missed it hidden somewhere I couldn’t find where to set it.

Before looking any further I wanted to be sure it was the same setting as in 11.1.2.1, where the default value was 5000.

I thought the setting meant total cells in the grid, but from testing it looks like it is the number of data entry cells. I tested by loading 5000 records from Data Management.


This time the export was successful.


The process log shows that the REST resource was successful and confirms the number of rows of data that were loaded.

14:50:22,865 DEBUG [AIF]: EssbaseService.performLoadRest - START
14:50:22,869 DEBUG [AIF]: requestUrl: http://localhost:9000/HyperionPlanning/rest/v3/applications/REF/plantypes/Plan1/importdataslice
14:51:07,384 INFO  [AIF]: Number of rows loaded: 5000, Number of rows rejected: 0

I tested again with 5001 records and the load failed.

2017-04-28 15:01:07,458 DEBUG [AIF]: requestUrl: http://localhost:9000/HyperionPlanning/rest/v3/applications/REF/plantypes/Plan1/importdataslice
2017-04-28 15:01:47,192 ERROR [AIF]: The rest service request has failed: 400 Bad Request - {"detail":"The form RestDataGrid_1493391668666 is very large and could take considerable time to load. Do you wish to continue?

So it looks like it is picking up the setting of 5000 that used to be around in planning in the past.

I then went to see if I could find the setting and ended up exporting the application settings using migration in the planning UI.


This exports an XML file named “Application Setting.xml” and in that file there is a reference to a form warning setting with a value of 5000.


This looks promising so I updated the value to one that would be higher than the number of records that would be loaded through Data Management.


I imported the application settings back into the application using migration.

Now to try the data load again with a total number of records just less than the main threshold.


Finally, I am able to successfully load the data.


The process log confirms the total number of rows loaded.

INFO  [AIF]: Number of rows loaded: 71426, Number of rows rejected: 0

Now, what about performance? Let us compare the timings for the different load methods with the same number of records as above.

First, setting the load method to “Numeric Data Only”, which means a data load file is produced and then loaded using an Essbase data load rule.

20:38:45,064 INFO  [AIF]: Loading data into cube using data file...
20:38:45,936 INFO  [AIF]: The data has been loaded by the rule file.

The data load time was just under a second.

Now for “All data types with security” with an administrator, which means the Outline Load utility will be the method for loading.

20:42:40 Successfully located and opened input file.
20:42:49 Outline data store load process finished. 71426 data records were read, 71427 data records were processed, 71427 were accepted for loading (verify actual load with Essbase log files), 0 were rejected.

This data load also completed quickly, in a matter of seconds.

Time for the REST method with a non-administrator.

20:54:46,385 DEBUG [AIF]: EssbaseService.performLoadRest - START
21:14:43,146 INFO  [AIF]: Number of rows loaded: 71426, Number of rows rejected: 0

A data load time of just under 20 minutes.

So using the REST method takes a massive amount of time compared to the other two methods. As this is EPM Cloud it is not possible to see the system resources being used by this method, but I can imagine the footprint is heavy compared to the other two.

If you are thinking of using this method to load data then consider the limitations and the performance impact it has; it is probably only advisable for small data sets.


Sunday 16 April 2017

EPM Cloud - masking data

One of the new features in release 17.04 is the ability to mask data using the EPM Automate utility, which also means it should be available through the REST API. The release documentation provides the following information:

“A new version of the EPM Automate Utility is available with this update. This version includes the maskData command, which masks all application data to ensure data privacy. You use this command only on test instances to hide sensitive data from users; for example, application developers, who should not access your business data.”

While the documentation says to use it only on test instances, it should be possible to run the command on a production instance, but the reality is you probably wouldn’t want to do that anyway.

The command is available for FCCS, PBCS, ePBCS and TRCS.

It is worth pointing out before using the command that it will update all data and make it meaningless, so make sure you have a backup such as a maintenance snapshot, and run it against the correct instance, as there is no going back.

The EPM Automate command syntax is:

epmautomate maskdata -f

-f is optional and suppresses the prompt asking the user to confirm whether to run the command; you would only really use the parameter if you are going to automate the process of masking data.
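For example, a scripted run would look something like the following, where the credentials, identity domain and URL are placeholders:

epmautomate login serviceadmin MyPassword https://test-myinstance.pbcs.us2.oraclecloud.com mydomain
epmautomate maskdata -f
epmautomate logout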

So let’s give it a test drive and run the command.


The output from the command gives us a little more insight into what is going on behind the scenes; as this is Software as a Service, we rarely get to see the full story of what is happening.

The process flow, for each database in the application, is:
  • Export all data in masked format
  • Clear all data from cube
  • Load masked data back into the cube
This applies to all cubes within the application, including ASO. For BSO it is a full export, so depending on the size of the databases the command could take a while to complete. I will go into more detail shortly on how I believe the masking of data is being achieved.

The following form shows an example of the data before running the mask data command.


After running the command the data has been masked and is now totally meaningless.


At the start of this post I said that the functionality should be available through REST. If you are not aware, the EPM Automate utility is basically built on top of the REST API, so if the functionality is in the utility then it should be possible to access it through REST.

The URL format for the REST resource is:

https://<cloudinstance>/interop/rest/v1/services/maskdata

I can replicate this with a REST client using a POST method.


The above response includes a URL which can then be accessed to check the status of the data masking.


A status of -1 means the process is running and a status of 0 indicates the process has completed; any other value means the process has failed.

For automation purposes this can easily be scripted in your preferred language; I have put together an example PowerShell script that calls the mask data REST resource and then keeps checking the status until the process has completed.
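A minimal sketch of that kind of script is shown below; the instance URL and credentials are placeholders, and the property names used to pull the status URL out of the response (links, rel, href, status) are assumptions that should be verified against the JSON your instance returns.

# Placeholder instance and credentials
$instance = "https://cloudinstance.oraclecloud.com"
$user = "domain.username"
$pass = "password"

# Build a basic authentication header
$auth = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("${user}:${pass}"))
$headers = @{ Authorization = "Basic $auth" }

# Kick off the data masking process
$response = Invoke-RestMethod -Method Post -Headers $headers -Uri "$instance/interop/rest/v1/services/maskdata"

# Assumption: the status URL is returned in the links collection of the response
$statusUri = ($response.links | Where-Object { $_.rel -eq "Job Status" }).href

# Keep checking the status until it is no longer -1 (running)
do {
    Start-Sleep -Seconds 30
    $job = Invoke-RestMethod -Method Get -Headers $headers -Uri $statusUri
} while ($job.status -eq -1)

if ($job.status -eq 0) { "Data masking completed successfully" } else { "Data masking failed with status $($job.status)" }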


So how is the masking of data achieved? Well, I know some people believe that if you put the word cloud into any sentence then some kind of magic occurs, but unfortunately I can’t always accept that. The clue was there when running the EPM Automate “maskdata” command:

“This command will export the data in masked format”

In the on-premise world a new feature was introduced in 11.1.2.4 MaxL; the following is taken from the Essbase new features readme:

“The MaxL export data statement includes grammar you can use to make exported data anonymous, wherein real data is replaced with generated values. This removes the risk of sensitive data disclosure, and can be used in case a model needs to be provided to technical support for reproduction of certain issues.”

The Essbase tech ref provides additional information on how the data is masked; for BSO:

“Export data in anonymized format. Anonymization removes the risk of sensitive data disclosure, and can be used in case sample data needs to be provided for technical support. Essbase replaces real data values with incremental values beginning with 0, increasing by 1 for each value in the block.”

If I take the Sample Basic BSO database for simplicity I can demonstrate what is happening in the cloud using MaxL.
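The statement uses the anonymous grammar of the MaxL export data statement and looks something like the following; the file name is illustrative and the exact grammar is in the tech ref:

export database Sample.Basic anonymous data to data_file 'anon_export.txt';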


The above example is a full data export using the anonymous syntax and the output shows how the cells in each block have been replaced with incremental values.


I know the basic database usually has scenario as a dense dimension but I updated it to sparse for this example.

Each block in the database will have the same incremental values including upper level blocks which means the aggregated values for stored upper level members will be incorrect.

For on-premise if you wanted the data to be aggregated correctly you could run the anonymous export for level 0, clear, load and then aggregate. For the cloud, you don’t have that control so you could run the mask data command, clear upper level data and then aggregate with a business rule.

A spreadsheet retrieve highlights the difference before and after masking the data.


Moving on to ASO, which uses a slightly different approach to masking the data; the documentation provides further details:

“Export data in anonymized format. Anonymization removes the risk of sensitive data disclosure, and can be used in case sample data needs to be provided for technical support. Essbase replaces real data values with 1, for each value in the block.”

I am not sure I agree with the statement about blocks and would prefer it to say input level cells. Also, the way the data is anonymized is quite primitive, but at least it gives a true reflection of where data did exist.

Anyway, let us take another example of masking the data by using the anonymous syntax against the ASO sample application.


The exported data contains all the level 0 data values replaced with a 1.


Another spreadsheet retrieve shows the difference before and after anonymizing the data.



I am going to leave it there, but hopefully you now understand how the data mask functionality works; even though it is a cloud command, the process can be replicated on-premise and with a greater level of flexibility.