pitchmuc / adobe-analytics-api-2.0
Python wrapper for the adobe analytics API 2.0
License: Apache License 2.0
findings = mycompany1.findComponentsUsage(myElements,projectDetails=list_details_projects, verbose = True)
I am passing through a list of workspace IDs in list_details_projects and get the following error:
UnboundLocalError: local variable 'myProjectDetails' referenced before assignment
It was working fine yesterday (22/08)...
Cheers
Nick
Getting and writing tags to single or multiple components at once would be really useful for my use cases. @pitchmuc Do you think this can be added easily? Or is it rather something for the longer term?
https://adobedocs.github.io/analytics-2.0-apis/?urls.primaryName=Reporting%20APIs#/component-metadata%20-%20tags/findAllForCompany
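Until such a helper exists, a batch payload builder could be sketched roughly like this. The payload shape and key names (componentType, componentId, tags) are assumptions to verify against the linked component-metadata tags documentation, not the library's actual API:

```python
# Hypothetical sketch: build one tagging entry per component id, so several
# components can be tagged in a single request body.
def build_tag_payload(component_ids, component_type, tags):
    return [
        {
            "componentType": component_type,
            "componentId": cid,
            "tags": [{"name": t} for t in tags],
        }
        for cid in component_ids
    ]
```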
Hi,
I am trying to capture data from Adobe Analytics API call in Python, but every time I get this error -
error_code: "429050", "message": "Too many requests"
I am using 2 dimensions and it gets stuck randomly at any point. Although the request limit is 120/minute, I got a final record count of 40, sometimes 90, sometimes 3080, and at the end it says "too many requests".
Could you please provide some input?
Thanks in advance.
Regards,
Meetu
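A client-side backoff along these lines might help while the throttling question is open. This is a sketch: `fetch_page` is a placeholder for whatever callable performs one API request and returns the parsed JSON, and the "429050" code is taken from the error message above.

```python
import time

def call_with_backoff(fetch_page, max_retries=5, base_delay=1.0):
    """Retry when the response reports throttling, doubling the wait each time."""
    for attempt in range(max_retries):
        res = fetch_page()
        if res.get("error_code") != "429050":
            return res
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError("still throttled after %d retries" % max_retries)
```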
Hey.
First, thanks for your work!
I think that "headers" is an invalid argument passed to getReport() in lines 846-847 of aanalytics2.py,
or am I mistaken?
Best
Philipp
Sorry to steal more free time from you. I tried to update today to the newest version (from git+https://github.com/xsavikx/adobe_analytics_api_2.0.git@4fc0e0cf11f3b93c3dae9b8d9f90b0fa6163375c#egg=aanalytics2, which made the version from early Nov work for me), but now I get this error:
This is the code (aa2 is my api2 object):
cfg.aaconfig_bycomp(company)
cid = aa2.getCompanyId('0')
cfg.aaconfig_bycomp(company) does this:
def function_that_is_called_by_cfg.aaconfig_bycomp(company):
    aa2.config.org_id = 'xxx@AdobeOrg'
    aa2.config.tech_id = '[email protected]'
    aa2.config.api_key = 'xxx'
    aa2.config.header["X-Api-Key"] = aa2.config.api_key
    aa2.config.secret = 'xxx'
    aa2.config.pathToKey = AA_PRIVATE_KEY_PATH_COMP
    aa2.configure(org_id=aa2.config.org_id, tech_id=aa2.config.tech_id, secret=aa2.config.secret,
                  path_to_key=aa2.config.pathToKey, client_id=aa2.config.api_key)
    return aa2
Thanks for help with this.
I keep getting this error when creating a dataframe for the dimension function
ValueError: If using all scalar values, you must pass an index
Please share how to resolve this
In your code, you silently assume a *NIX system (macOS or GNU/Linux).
For example https://github.com/pitchmuc/adobe_analytics_api_2.0/blob/04e29001025a6440e972efe6b1b9f292aac956a8/adobe_analytics_2/aanalytics2.py#L42 could break on a windows machine.
Join paths using os.path.join
instead.
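The suggestion in practice (the filename here is illustrative, not necessarily the one used at the linked line):

```python
import os

# os.path.join picks the right separator ("\\" on Windows, "/" elsewhere),
# so the same code runs on both platforms.
config_path = os.path.join("aanalytics2", "config_admin.json")
```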
When running getReport with a populated RequestCreator object, the getReport function raises TypeError: argument of type 'RequestCreator' is not iterable
#print(type(test_report))
#<class 'aanalytics2.requestCreator.RequestCreator'>
myreport = mycompany.getReport(test_report)
It errors at line 2432:
adobe-analytics-api-2.0/aanalytics2/aanalytics2.py
Lines 2425 to 2440 in 658899c
One solution would be the following:
elif isinstance(json_request, str) and '.json' in json_request:
    try:
        with open(Path(json_request), 'r') as file:
            file_string = file.read()
        request = json.loads(file_string)
    except (OSError, json.JSONDecodeError):
        raise TypeError("expected a parsable string")
Another would be to move the RequestCreator instance check above the erroring line.
elif isinstance(json_request, RequestCreator):
    request = json_request.to_dict()
elif '.json' in json_request:
    try:
        with open(Path(json_request), 'r') as file:
            file_string = file.read()
        request = json.loads(file_string)
    except (OSError, json.JSONDecodeError):
        raise TypeError("expected a parsable string")
This is an error I have gotten for many weeks. After debugging it (without changing anything in the code), it worked again after failing the first time.
But now, an hour later, I tried again and I get this error again:
DEBUG:urllib3.connectionpool:https://analytics.adobe.io:443 "POST /api/compet2/reports HTTP/1.1" 504 0
Traceback (most recent call last):
File "", line 1, in
File "C:\Program Files\JetBrains\PyCharm 2019.3.3\plugins\python\helpers\pydev_pydev_bundle\pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2019.3.3\plugins\python\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:/pycharm-projects/dim28-repo/dim28etl/adobe/cost_check.py", line 176, in
run_script()
File "D:/pycharm-projects/dim28-repo/dim28etl/adobe/cost_check.py", line 124, in run_script
anom_cost_res = aa2.getReport(anom_cost_r, n_result='100')
File "D:\pycharm-projects\dim28-repo\venv\lib\site-packages\adobe_analytics_2\aanalytics2.py", line 726, in getReport
count_elements += report['numberOfElements']
KeyError: 'numberOfElements'
I'm running into an error thrown by pandas: ValueError: Length mismatch: Expected axis has 83 elements, new values have 88 elements
(see partial trace below)
I have the strong feeling this is due to the columns returned by the API not always matching the columns set in the request sent to the API.
One reason why this might be the case, is when a metric is disabled. (I was able to reproduce the issue by adding two metrics in my request: one active and one inactive).
Another reason might be when there is just no data for one of the columns.
I see in your code that the columns you set for the pandas dataframe are retrieved from the request JSON, while it might be better to check this against the API response.
Would that make sense?
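A sketch of that check (function and fallback names are mine; a real fix would likely derive the labels from the response's column metadata instead):

```python
def safe_set_columns(df_width, requested_cols):
    """Return labels matching the actual dataframe width.

    If the API dropped columns (e.g. a disabled metric), fall back to
    positional labels so the dataframe still builds instead of raising
    a length-mismatch ValueError.
    """
    if len(requested_cols) == df_width:
        return requested_cols
    return [f"col_{i}" for i in range(df_width)]
```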
...
c:\users\...\.virtualenvs\test-jupyter-aa-api-2ao00mzi\lib\site-packages\adobe_analytics_2\aanalytics2.py in _readData(data_rows, anomaly, cols, item_id)
777 df = _pd.DataFrame(dict_data).T # require to transform the data
778 df.reset_index(inplace=True,)
--> 779 df.columns = cols
780 return df
781
c:\users\...\.virtualenvs\test-jupyter-aa-api-2ao00mzi\lib\site-packages\pandas\core\generic.py in __setattr__(self, name, value)
5285 try:
5286 object.__getattribute__(self, name)
-> 5287 return object.__setattr__(self, name, value)
5288 except AttributeError:
5289 pass
pandas\_libs\properties.pyx in pandas._libs.properties.AxisProperty.__set__()
c:\users\...\.virtualenvs\test-jupyter-aa-api-2ao00mzi\lib\site-packages\pandas\core\generic.py in _set_axis(self, axis, labels)
659
660 def _set_axis(self, axis, labels) -> None:
--> 661 self._data.set_axis(axis, labels)
662 self._clear_item_cache()
663
c:\users\...\.virtualenvs\test-jupyter-aa-api-2ao00mzi\lib\site-packages\pandas\core\internals\managers.py in set_axis(self, axis, new_labels)
176 if new_len != old_len:
177 raise ValueError(
--> 178 f"Length mismatch: Expected axis has {old_len} elements, new "
179 f"values have {new_len} elements"
180 )
ValueError: Length mismatch: Expected axis has 83 elements, new values have 88 elements
As far as I
Hi pitchmuc. :)
I encountered another problem and I think, I have a fix for it.
When you pull your data from Adobe, but somehow you have a missing 'value', the code stops with a "KeyError: 'value'".
However, if you replace Line 757 with:
dict_data = {row.get('value'): row.get('data') for row in data_rows}
the code runs through and the affected observations get None as a key (this can be changed via row.get('value', xyz)).
You might still want to add some warning, or something if that happens.
On that topic see maybe:
https://stackoverflow.com/questions/11041405/why-dict-getkey-instead-of-dictkey
Best
Philipp
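Extending the fix above with the warning it mentions, as a sketch. It assumes `data_rows` has the shape the traceback implies: a list of dicts that usually carry 'value' and 'data' keys.

```python
import warnings

def rows_to_dict(data_rows):
    """Key rows by their 'value', tolerating (and reporting) missing keys."""
    missing = [row for row in data_rows if 'value' not in row]
    if missing:
        warnings.warn(f"{len(missing)} row(s) had no 'value' key; keyed as None")
    return {row.get('value'): row.get('data') for row in data_rows}
```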
When I use the findComponentsUsage function, it fails with this log:
retrieving segments
retrieving calculated metrics
retrieving projects details - long process
1344 project details to retrieve
estimated time required : 22 minutes
search started
recursive option : True
start looking into segments
start looking into calculated metrics
start looking into projects
Traceback (most recent call last):
File "d:\Projects\AA_Logs\aapy.py", line 177, in <module>
findings = mycompany.findComponentsUsage(myElements,verbose=True,recursive=True,rsidPrefix=True)
File "C:\Python\lib\site-packages\aanalytics2\aanalytics2.py", line 1669, in findComponentsUsage
for element in proj['segments']:
KeyError: 'segments'
Happy to help debug this, but this is all the information I have right now.
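A likely-fix sketch for the failing line, assuming projects without segments simply have no 'segments' key in their details (the helper name is mine, not the library's):

```python
def iter_project_segments(proj):
    """Yield segment entries, tolerating projects without a 'segments' key."""
    yield from proj.get('segments', [])
```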
In the ingestion module, the BIDIA class returns the response code and not the JSON object as specified in the official documentation.
This issue has been raised in the following discussion.
See this example report which uses an ad-hoc hit-based segment based on a dimensional value filter in the dropzone (Channel = SEA):
In this case there is no segmentId in the JSON and the report fails with {KeyError}'segmentId'. See the example JSON of the globalFilter in question:
[
{
'type': 'segment',
'segmentDefinition': {
'container': {
'func': 'container',
'context': 'hits',
'pred': {
'func': 'streq-in',
'list': [
'SEA'
],
'val': {
'func': 'attr',
'name': 'variables/campaign.1'
},
'description': 'Channel'
}
},
'func': 'segment',
'version': [
1,
0,
0
]
},
'dateRange': '2021-02-22T00:00:00.000/2021-02-22T00:00:00.000'
},
{
'type': 'dateRange',
'dateRange': '2021-02-22T00:00:00.000/2021-02-23T00:00:00.000'
}
]
def deleteCalculatedMetrics(self, calcID: str = None) -> object:
=> the calcID is a string, so it deletes one metric per execution only. That is why the function name with "Metrics" is misleading.
It would also be more consistent with the other calcMetrics function, updateCalculatedMetric, which likewise updates just one metric and not multiple ones.
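If multi-delete is wanted today, a thin loop over the single-ID call works as a stopgap. This is a sketch; it assumes deleteCalculatedMetrics returns a per-call response worth keeping:

```python
def delete_calculated_metrics(client, calc_ids):
    """Delete several calculated metrics one by one, keyed by their ID."""
    return {calc_id: client.deleteCalculatedMetrics(calc_id) for calc_id in calc_ids}
```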
I was trying out the (fantastically useful!) compareReportSuites function and encountered some problems:
File "C:\Python\lib\site-packages\aanalytics2\aanalytics2.py", line 1827, in compareReportSuites
listDFs = [self.getDimensions(rsid) for rsid in listRsids]
File "C:\Python\lib\site-packages\aanalytics2\aanalytics2.py", line 1827, in <listcomp>
listDFs = [self.getDimensions(rsid) for rsid in listRsids]
File "C:\Python\lib\site-packages\aanalytics2\aanalytics2.py", line 593, in getDimensions
df_dims = df_dims[columns]
File "C:\Python\lib\site-packages\pandas\core\frame.py", line 3030, in __getitem__
indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
File "C:\Python\lib\site-packages\pandas\core\indexing.py", line 1266, in _get_listlike_indexer
self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
File "C:\Python\lib\site-packages\pandas\core\indexing.py", line 1316, in _validate_read_indexer
raise KeyError(f"{not_found} not in index")
KeyError: "['parent'] not in index"
Overall: Great work and thank you again for the awesome functionality!
getReport2 returns incorrect numbers for metrics/pageviews. These do not match the Adobe portal. getReport returns numbers that do match the Adobe portal.
I would call it a bug, but maybe it is mandatory for some reason. When I call getReport2() without a statistics block, it returns an error.
p:\programdata\anaconda3\lib\site-packages\aanalytics2\aanalytics2.py in getReport2(self, request, limit, n_results, allowRemoteLoad, useCache, useResultsCache, includeOberonXml, includePredictiveObjects, returnsNone, countRepeatInstances, ignoreZeroes, rsid, resolveColumns, save, returnClass)
2720 dataRequest["statistics"]["ignoreZeroes"] = True
2721 else:
-> 2722 dataRequest["statistics"]["ignoreZeroes"] = False
2723 ### Request data
2724 if self.loggingEnabled:
KeyError: 'statistics'
My json file (modified just in case) is this one, in which I don't have the statistics block.
jsn = """{
"rsid": "id1",
"globalFilters": [
{
"type": "segment",
"segmentId": "s21_59c218..."
},
{
"type": "dateRange",
"dateRange": "2022-01-01T00:00:00.000/2022-02-01T00:00:00.000"
}
],
"metricContainer": {
"metrics": [
{
"columnId": "0",
"id": "metrics/visitors"
},
{
"columnId": "2",
"id": "metrics/visitors",
"filters": [
"0"
]
}
],
"metricFilters": [
{
"id": "0",
"type": "segment",
"segmentId": "s21_59c218..."
}
]
},
"dimension": "variables/daterangeday",
"settings": {
"countRepeatInstances": true,
"includeAnnotations": false,
"dimensionSort": "asc"
}
}"""
jsn = json.loads(jsn)
The fix is easy on my side, since I only have to append a dummy statistics block like this one at the end:
,
"statistics": {
"functions": [
"col-max",
"col-min"
]
}
But in general, I don't want the statistics but the filteredTotals and totals from summaryData (which I already have), not the col-max and col-min.
Also, with getReport() I could send the JSON as a string, but now I need to parse it with json.loads() before using getReport2(). This is not a problem, since this way I can modify more things, but I mention it in case you wanted to keep the same behavior in both calls.
Thanks for the library and the new version 😄
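On the library side, the KeyError could be avoided by creating the key before assignment. A sketch mirroring the failing lines around 2720-2722 (the function name is mine):

```python
def set_ignore_zeroes(dataRequest, ignoreZeroes):
    """Set statistics.ignoreZeroes, creating the block if the request lacks one."""
    dataRequest.setdefault("statistics", {})
    dataRequest["statistics"]["ignoreZeroes"] = bool(ignoreZeroes)
    return dataRequest
```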
I still do get issues with 429 being returned (too many requests) sometimes when I run many requests in a row. E.g. here a client wanted to delete 1000 Workspaces via the Adobe Component Manager for Google Sheets, so I am firing a loop to delete project by project:
From the logs, I see that at least one of the requests had this problem:
I recently installed your aanalytics2 package. However, to get started, the config file in your package requires me to fill in the org id, api key, tech id, secret and pathToKey. But my org has asked me to use Swagger to get the api key, org id and bearer token to access the APIs. So how do I work with your package if I only have an org id, api key and bearer token?
myElements = ['555fa11fe4b0e34068484fc5']
findings = mycompany1.findComponentsUsage(myElements, verbose=True)
retrieving segments
retrieving calculated metrics
retrieving projects details - long process
3953 project details to retrieve
estimated time required : 65 minutes
KeyError Traceback (most recent call last)
in
1 myElements = ['555fa11fe4b0e34068484fc5']
----> 2 findings= mycompany1.findComponentsUsage(myElements, verbose = True)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\aanalytics2.py in findComponentsUsage(self, components, projectDetails, segments, calculatedMetrics, recursive, regexUsed, verbose, resetProjectDetails, rsidSuffix)
1497 if verbose:
1498 print('retrieving projects details - long process')
-> 1499 self.projectDetails = self.getAllProjectDetails(verbose=verbose,rsidSuffix=rsidSuffix)
1500 myProjectDetails = (self.projectsDetails[key].to_dict() for key in self.projectsDetails)
1501 elif len(self.projectsDetails) > 0 and projectDetails is None and resetProjectDetails==False:
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\aanalytics2.py in getAllProjectDetails(self, projects, filterNameProject, filterNameOwner, useAttribute, cache, rsidSuffix, verbose)
1353 print(f"estimated time required : {int(len(fullProjectIds)/60)} minutes")
1354 projectIds = (project['id'] for project in fullProjectIds)
-> 1355 projectsDetails = {projectId:self.getProject(projectId,projectClass=True,rsidSuffix=rsidSuffix) for projectId in projectIds}
1356 if filterNameProject is None and filterNameOwner is None:
1357 self.projectsDetails = projectsDetails
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\aanalytics2.py in (.0)
1353 print(f"estimated time required : {int(len(fullProjectIds)/60)} minutes")
1354 projectIds = (project['id'] for project in fullProjectIds)
-> 1355 projectsDetails = {projectId:self.getProject(projectId,projectClass=True,rsidSuffix=rsidSuffix) for projectId in projectIds}
1356 if filterNameProject is None and filterNameOwner is None:
1357 self.projectsDetails = projectsDetails
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\aanalytics2.py in getProject(self, projectId, projectClass, rsidSuffix, retry, cache, verbose)
1306 if verbose:
1307 print('building an instance of Project class')
-> 1308 myProject = Project(res,rsidSuffix=rsidSuffix)
1309 return myProject
1310 if cache:
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\projects.py in init(self, projectDict, rsidSuffix)
33 if definition.get('device', 'desktop') != 'cell':
34 self.reportType = "desktop"
---> 35 infos = self._findPanelsInfos(definition['workspaces'][0])
36 self.nbPanels: int = infos["nb_Panels"]
37 self.nbSubPanels: int = 0
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\projects.py in _findPanelsInfos(self, workspace)
67 dict_data["panels"][panel['id']]['nb_subPanels'] = len(panel['subPanels'])
68 dict_data["panels"][panel['id']]['subPanels_types'] = [subPanel['reportlet']['type'] for subPanel in
---> 69 panel['subPanels']]
70 return dict_data
71
~\AppData\Local\Continuum\anaconda3\lib\site-packages\aanalytics2\projects.py in (.0)
66 dict_data["panels"][panel['id']]['name'] = panel.get('name', 'Default Name')
67 dict_data["panels"][panel['id']]['nb_subPanels'] = len(panel['subPanels'])
---> 68 dict_data["panels"][panel['id']]['subPanels_types'] = [subPanel['reportlet']['type'] for subPanel in
69 panel['subPanels']]
70 return dict_data
KeyError: 'reportlet'
I cannot run getVirtualReportSuites() on a server without write permissions because the function always by default creates a CSV file. This should not be the default behaviour.
The docstring is also misleading:
save : OPTIONAL : if set to True, it will save the list in a file. (Default False)
Due to the checks applied in the retrieveToken and importConfigFile functions, it's currently not possible to supply a configuration with an absolute path to the private key.
While generally it may not be a big deal, in some environments it is a critical issue. In our case, we're not able to determine the folder from which the app is going to be started, so it's not possible to provide a normal relative path to the key. In such a scenario, we'd like to provide a full path to the key instead.
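A sketch of how the key lookup could honor absolute paths, on the assumption that the current checks only search relative to the working directory (the function name is mine):

```python
import os

def resolve_key_path(path_to_key):
    """Return a usable key path, accepting absolute paths as-is."""
    if os.path.isabs(path_to_key):
        return path_to_key if os.path.exists(path_to_key) else None
    # fall back to the existing relative-path search
    candidate = os.path.join(os.getcwd(), path_to_key)
    return candidate if os.path.exists(candidate) else None
```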
Hi Julien
Especially when running a script that does multiple queries (e.g. a loop), I tend to get the message "Server Error - no save file & empty dataframe" after e.g. 2-5 iterations:
I also added a 60-second delay between each iteration.
The queries try to get 1000 rows of data for e.g. 7 columns which should not be such a big thing.
Is the Adobe API so unstable that you often end up with no results, or is there anything I can do to circumvent this? Especially when running something in a server environment rather than locally, it is crucial that there are no failures.
This is the code I am running, but I guess this is a deeper issue.
cfg.aaconfig2() # sets the keys etc...
cid = aa2.getCompanyId('0')
ags = aa2.Analytics(cid)
daysback_start = 120 # how far back should the from-date range go?
daysback_end = 1 # until how many days back should the to-date go? (e.g. 1 = yesterday)
dates = []
# generate dates to export
for x in range(daysback_start - daysback_end):
    fro = (datetime.datetime.now() - timedelta(days=daysback_start - x)).strftime("%Y-%m-%d")
    to = (datetime.datetime.now() - timedelta(days=daysback_start - x - 1)).strftime("%Y-%m-%d")
    dates.append(fro + "T00:00:00.000/" + to + "T00:00:00.000")

for dateRange in dates:
    logging.info("get search volume for " + dateRange)
    date = dateRange[0:10]
    aa_req_click2order = '''{
"rsid": "###RSID_REMOVED###",
"globalFilters": [
{
"type": "segment",
"segmentId": "s3676_5cdd6d3ac0fdcb1990ad1fc3"
},
{
"type": "segment",
"segmentDefinition": {
"container": {
"func": "container",
"context": "hits",
"pred": {
"func": "streq",
"str": "de",
"val": {
"func": "attr",
"name": "variables/evar37"
},
"description": "Content Language (Hit - eVar37)"
}
},
"func": "segment",
"version": [
1,
0,
0
]
}
},
{
"type": "dateRange",
"dateRange": "%s"
}
],
"metricContainer": {
"metrics": [
{
"columnId": "0",
"id": "cm3676_5d24c9de20cf0c0470164e3d",
"sort": "desc"
},
{
"columnId": "1",
"id": "cm3676_5ece3fc52132860483feded1"
},
{
"columnId": "2",
"id": "cm3676_5d2f05103d486a2821cad1dc"
},
{
"columnId": "3",
"id": "cm3676_5d24c945e1be724fc54d9dac"
},
{
"columnId": "4",
"id": "cm3676_5c91330758592d245541238e"
},
{
"columnId": "5",
"id": "metrics/revenue"
}
]
},
"dimension": "variables/evar58",
"search": {
"clause": "( NOT MATCH 'Unspecified' )"
},
"settings": {
"countRepeatInstances": true,
"limit": 1000,
"page": 0,
"nonesBehavior": "default"
},
"statistics": {
"functions": [
"col-max",
"col-min"
]
}
    }''' % dateRange
    time.sleep(60)  # pause for 60 seconds to avoid the throttling limit
    search_cto_raw = ags.getReport(aa_req_click2order, n_result='1000')
    search_cto_df = search_cto_raw["data"]
WARNING:root:Could not get details for project ID: 5877a42abdf14b9445bb811c, error: 'segmentGroups'
The project is older, but I didn't have this issue before. I updated to newest version 0.2.5 from 0.2.4 yesterday. I will try to gather more details.
When I switch on Verbose, it says:
request URL : https://analytics.adobe.io/api/XXX/projects/5877a42abdf14b9445bb811c?expansion=definition%2CownerFullName%2Cmodified%2Cfavorite%2Capproved%2Ctags%2Cshares%2CsharesFullName%2CreportSuiteName%2CcompanyTemplate%2CaccessLevel
statut_code : 200
Then: {KeyError}'segmentGroups'
I've just been playing around with the library and found a couple of "gotchas" when creating config files using 'createConfigFile()'. These should be quite simple to mitigate, by adding some parsing and error raise logic to the function. If I can find some time this week I'll happily create a branch and PR.
Invalid file extensions are not recognised or updated in the destination= arg:
>>> import aanalytics2 as api2
>>> CONFIG_FILE = "config.ini"
>>> api2.createConfigFile(destination=CONFIG_FILE)
... this creates a config file named config.ini.json
createConfigFile does not support the PosixPath type, and the error is not user-friendly:
>>> from pathlib import Path
>>> import aanalytics2 as api2
>>> CONFIG_PATH = Path("config.json")
>>> api2.createConfigFile(destination=CONFIG_PATH)
File "/opt/miniconda3/envs/adobe_api/lib/python3.10/site-packages/aanalytics2/configs.py", line 41, in createConfigFile
if '.json' not in destination:
TypeError: argument of type 'PosixPath' is not iterable
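A normalization sketch covering both gotchas (the function name is mine, not the library's):

```python
from pathlib import Path

def normalize_destination(destination):
    """Accept str or Path and force a .json extension."""
    dest = Path(destination)          # handles both str and PosixPath
    return dest.with_suffix(".json")  # replaces .ini etc., appends if absent
```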
Possible mitigations:
- str() type conversion
- isinstance(destination, PosixPath)
- pathlib.Path() to split the file path and replace the extension

I'd suggest putting the most likely changed argument first. I feel that this would be the save argument here.
A script failed last night due to an error I sometimes get locally as well.
I am trying
cid = aa2.getCompanyId('0')
But then this happens:
It is imho due to the same problem (instability of the Adobe API, which sometimes does not return anything; this was escalated to Adobe).
I will add a "retry" solution as well (like #21 ) for my stuff, but maybe the aanalytics library should support this "natively".
That being said, this is harder as we don't even get an empty dataframe, we get an error that needs to be caught somehow.
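A generic retry wrapper along the lines of the workaround described (names are mine; the exact exception type the library raises would need to be checked and the catch narrowed to it):

```python
import time

def with_retry(fn, attempts=3, delay=1.0):
    """Call fn, retrying a few times with a pause when it raises."""
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # ideally narrow this to the API error type
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```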
Hi,
When I tried to run cids = api2.retrieveToken() after importing the config file, I get an error saying "Algorithm 'RS256' could not be found. Do you have cryptography installed?" I have both the cryptography and pyJWT packages installed. I'm not sure what the issue is here.
I go to the Adobe Analytics portal and get my JSON request there (see photo below). Then, I pass the JSON request through the getReport method and compare the results with those seen in the Adobe Analytics portal.
Bug: The values in the dataframe returned by the getReport method differ greatly from those shown in the table in the Adobe Analytics portal. Everything is correct EXCEPT the numbers seen in the dataframe. The size of the dataframe, as well as the dimension, filters, metrics and rsid, are all correct.
Note: No errors are thrown by aanalytics2. I've tried multiple files and have checked that the filter selections used in the portal match those seen in the JSON request.
Please try to avoid Pokémon exception handling in
https://github.com/pitchmuc/adobe_analytics_api_2.0/blob/04e29001025a6440e972efe6b1b9f292aac956a8/adobe_analytics_2/aanalytics2.py#L139
If I read the docs right, you should handle a ValueError instead.
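An illustrative contrast (the actual operation at the linked line may differ; `parse_port` is a hypothetical example, not the library's code): catch only the error the docs say can occur, and let everything else propagate.

```python
def parse_port(raw):
    """Return the integer port, or None for bad input."""
    try:
        return int(raw)
    except ValueError:  # narrow: only bad input, not e.g. KeyboardInterrupt
        return None
```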
Hi,
Once I create and import the config file, and try
cid = api2.getCompanyId()
cid
I get the following error:
FileNotFoundError Traceback (most recent call last)
in
----> 1 cid = api2.getCompanyId()
2 cid
/opt/anaconda3/lib/python3.8/site-packages/aanalytics2/aanalytics2.py in checking(*args, **kwargs)
27 now = time.time()
28 if now > config.config_object["date_limit"] - 1000:
---> 29 token_with_expiry = token_provider.get_token_and_expiry_for_config(config.config_object, *args,
30 **kwargs)
31 token = token_with_expiry['token']
/opt/anaconda3/lib/python3.8/site-packages/aanalytics2/token_provider.py in get_token_and_expiry_for_config(config, verbose, save, *args, **kwargs)
17 save : OPTIONAL : Default False. If set to True, save the toke in the .
18 """
---> 19 private_key = configs.get_private_key_from_config(config)
20 header_jwt = {
21 'cache-control': 'no-cache',
/opt/anaconda3/lib/python3.8/site-packages/aanalytics2/configs.py in get_private_key_from_config(config)
137 private_key_path = find_path(config['pathToKey'])
138 if private_key_path is None:
--> 139 raise FileNotFoundError(f'Unable to find the private key under path {config["pathToKey"]}
.')
140 with open(Path(private_key_path), 'r') as f:
141 private_key = f.read()
FileNotFoundError: Unable to find the private key under path <path/to/your/privatekey.key>
.
I tried moving the key to a new path, but I am still encountering the error message above.
Thank you
It seems that there is an API call to create a Data Warehouse extraction: https://github.com/AdobeDocs/analytics-1.4-apis/blob/master/docs/reporting-api/data_warehouse.md. It says that I need to use the Reporting API and add "source": "warehouse", but I couldn't find how to do it. I suppose that is because this library doesn't have the Report.Queue method? I think I need to use the original library and build the call myself, but maybe you want to add this section of the APIs.
Hi,
Why is there a limit of 999 rows? Even when I try to change the limit in settings in the json file or try to change the page, I always get the first 999 rows. Please assist.
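The API returns at most about a thousand rows per page, so getting more means paging (or using the wrapper's n_result argument, which appears elsewhere in these issues as n_result='1000'). A client-side paging sketch (names are mine):

```python
def paged_requests(base_request, total_rows, page_size=1000):
    """Clone a report request once per page, bumping settings.page each time."""
    pages = -(-total_rows // page_size)  # ceiling division
    out = []
    for p in range(pages):
        req = dict(base_request)
        settings = dict(req.get("settings", {}))
        settings.update({"limit": page_size, "page": p})
        req["settings"] = settings
        out.append(req)
    return out
```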
Rather than passing a format string for the logger, it would be great if a Formatter class could be passed through, to be used with the setFormatter(fmt) method.
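What that could look like from the caller's side; in the standard library the formatter is attached to a handler via Handler.setFormatter, so the wrapper could accept a logging.Formatter instance directly (a sketch, not the wrapper's actual signature):

```python
import logging

formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
handler = logging.StreamHandler()
handler.setFormatter(formatter)  # a Formatter subclass would work here too
logger = logging.getLogger("aanalytics2.example")
logger.addHandler(handler)
```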
It does not seem to be possible to get simple summary metrics easily via the wrapper.
This report, where the "dimension" is the "All Visits" segment, gets me this JSON:
{
"rsid": "REPORTSUITE",
"globalFilters": [
{
"type": "dateRange",
"dateRange": "2023-03-09T00:00:00.000/2023-03-16T00:00:00.000",
"dateRangeId": "last7Days"
}
],
"metricContainer": {
"metrics": [
{
"columnId": "metrics/event8:::0",
"id": "metrics/event8",
"filters": [
"STATIC_ROW_COMPONENT_1"
]
}
],
"metricFilters": [
{
"id": "STATIC_ROW_COMPONENT_1",
"type": "segment",
"segmentId": "All_Visits"
}
]
},
"settings": {
"countRepeatInstances": True,
"includeAnnotations": False
},
"statistics": {
"functions": [
"col-max",
"col-min"
]
},
"capacityMetadata": {
"associations": [
{
"name": "applicationName",
"value": "Analysis Workspace UI"
}
]
}
}
But the function throws an error because the _dataDescriptor always expects a "dimension".
Same if the "dimension" is a date range, e.g.:
I can work around this, but what is the most efficient way to get simply the total number of [[metric X]] for a certain date range (without any dimensions)?
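One possible workaround: keep a dimension the API accepts (e.g. "variables/daterangeday") and total the metric column client-side. A sketch, assuming rows shaped like the {'value': ..., 'data': [...]} entries seen elsewhere in these issues:

```python
def metric_total(rows, metric_index=0):
    """Sum one metric column across the report rows."""
    return sum(row["data"][metric_index] for row in rows)
```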
I tried to update to the newest version today, but cannot log in anymore. Could it be that the new better logging is related to this? Have no time to check this in detail at the moment, but rolling back for now to older version.
Traceback (most recent call last):
File "C:\Users\Franz\AppData\Local\Programs\Python\Python37\lib\contextlib.py", line 112, in enter
return next(self.gen)
File "C:\Users\Franz\AppData\Local\Programs\Python\Python37\lib\importlib\resources.py", line 201, in path
with open_binary(package, resource) as fp:
File "C:\Users\Franz\AppData\Local\Programs\Python\Python37\lib\importlib\resources.py", line 91, in open_binary
return reader.open_resource(resource)
File "", line 929, in open_resource
FileNotFoundError: [Errno 2] No such file or directory: 'C:\gitrepos\dim28-repo\venv\lib\site-packages\aanalytics2\eventType_usageLogs.pickle'
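A packaging guess: the traceback suggests eventType_usageLogs.pickle is not shipped inside the installed package. If so, a package_data stanza along these lines in setup.py would include it (illustrative values; check the project's real setup configuration):

```python
# The kwargs that would make setuptools ship the pickle with the wheel.
PACKAGE_DATA_KWARGS = {
    "packages": ["aanalytics2"],
    "package_data": {"aanalytics2": ["*.pickle"]},
    "include_package_data": True,
}
```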
Input a projectId and return all freeform table JSON requests for that projectId. Is there currently a way to do this? Thanks!
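Not aware of a one-call helper, but a traversal can be sketched from the project-definition shape visible in other tracebacks in this thread (workspaces -> panels -> subPanels -> reportlet). The key names, and especially the "FreeformReportlet" type string, are assumptions to verify against a real project JSON:

```python
def freeform_reportlets(definition):
    """Collect freeform-table reportlets from a project definition dict."""
    out = []
    for workspace in definition.get("workspaces", []):
        for panel in workspace.get("panels", []):
            for sub in panel.get("subPanels", []):
                reportlet = sub.get("reportlet", {})
                if reportlet.get("type") == "FreeformReportlet":
                    out.append(reportlet)
    return out
```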
I have created a Service Account (JWT), within the project "API Access", in Adobe. There is no problem when reading segment information or creating a new segment. But when I try to update a segment I am getting this error:
updateSegment = ags.updateSegment('s2571_63e7c27a72073f71d4b9865d',segment_2)
updateSegment
{'errorCode': 'insufficient_access',
'errorDescription': 'No editing access to this item granted. Userid=200423211 does not match ownerid=200283947',
'errorId': '244625f1-8169-4ebd-8b3d-41fe2878fd8a'}
Normally, you are using single quotes for your dictionaries.
Why do you switch to double ones in https://github.com/pitchmuc/adobe_analytics_api_2.0/blob/04e29001025a6440e972efe6b1b9f292aac956a8/adobe_analytics_2/aanalytics2.py#L81-L85 and following?
PyJWT 2.0.0 was released some days ago (see https://github.com/jpadilla/pyjwt/releases/tag/2.0.0)
It would be nice if we could update to the newest release.
I am trying to execute this code:
import aanalytics2 as api2
import pandas as pd
api2.importConfigFile('C:\your directory\config_analytics_template.json')
login = api2.Login()
cids = login.getCompanyId()
cid = cids[0]['globalCompanyId']
mycompany = api2.Analytics(cid)
df = mycompany.getUsageLogs()
I get a key error on line 9 as below:
KeyError Traceback (most recent call last)
in ()
1
----> 2 df = mycompany.getUsageLogs()
/usr/local/lib/python3.7/dist-packages/aanalytics2/aanalytics2.py in getUsageLogs(self, startDate, endDate, eventType, event, rsid, login, ip, limit, max_result, format, verbose, **kwargs)
1919 self.logger.debug(f"params: {params}")
1920 res = self.connector.getData(self.endpoint_company + path, params=params,verbose=verbose)
-> 1921 data = res['content']
1922 lastPage = res['lastPage']
1923 while lastPage == False:
KeyError: 'content'
Could you please let me know what setup I am missing, so I can take care of this issue.
I keep getting this error for all the functions in the Analytics Class.
{'error_code': '403025', 'message': 'Profile is not valid'}
I have the developer and admin rights for the Analytics Product.
Note: I am able to retrieve the JWT Authentication access token.
Please share how to resolve this
Hi @pitchmuc,
what is your motivation for using globals in this code part?
https://github.com/pitchmuc/adobe_analytics_api_2.0/blob/04e29001025a6440e972efe6b1b9f292aac956a8/adobe_analytics_2/aanalytics2.py#L49-L54
The last update introduced new functionality to the awesome findComponentsUsage function. Unfortunately, it now produces an error when called:
retrieving segments
retrieving calculated metrics
retrieving projects details - long process
Traceback (most recent call last):
File "d:\Projects\AA_Logs\aapy.py", line 177, in <module>
findings = mycompany.findComponentsUsage(myElements,verbose=True,recursive=True,rsidPrefix=True)
File "C:\Python\lib\site-packages\aanalytics2\aanalytics2.py", line 1564, in findComponentsUsage
self.projectDetails = self.getAllProjectDetails(verbose=verbose,rsidPrefix=rsidPrefix)
TypeError: getAllProjectDetails() got an unexpected keyword argument 'rsidPrefix'
From my investigations, I found that a parameter for the getAllProjectDetails function might have been renamed independently from the findComponentsUsage function. Specifically, if I change self.projectDetails = self.getAllProjectDetails(verbose=verbose,rsidPrefix=rsidPrefix)
to self.projectDetails = self.getAllProjectDetails(verbose=verbose,rsidSuffix=rsidPrefix)
, the function works again.
When importing the config file and trying to generate the token via the legacy (old) method, the Python interpreter returns the following error.
Traceback (most recent call last):
File "C:\xxx\helpers.py", line 202, in
cid = aa2.getCompanyId(param)
File "C:\xxx\venv\lib\site-packages\aanalytics2\aanalytics2.py", line 136, in checking
config.config_object["token"] = retrieveToken(*args, **kwargs)
File "C:\xxx\venv\lib\site-packages\aanalytics2\aanalytics2.py", line 88, in retrieveToken
with open(private_key_path, 'r') as f:
PermissionError: [Errno 13] Permission denied: '.
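A PermissionError with [Errno 13] on `open()` usually means the configured private-key path points at a directory or at a file the process cannot read. A hedged pre-flight check one could run before token retrieval (the function name is made up, not part of the library):

```python
import os


def check_private_key(path: str) -> str:
    """Verify the private-key path is a readable file, then return its contents."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"private key path is not a file: {path!r}")
    if not os.access(path, os.R_OK):
        raise PermissionError(f"private key is not readable: {path!r}")
    with open(path, "r") as f:
        return f.read()
```

Running this against the path from the config file makes it obvious whether the path itself, or the file's permissions, is the problem.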
Hi there,
I followed the link from https://www.fullstackanalyst.io/blog/adobe-analytics/monitor-adobe-analytics-usage-for-free-with-power-bi-and-python/ and was hoping to use this to get some SS reporting running but I keep getting a 'content' error.
Error logs as below:
startDate='2022-07-11T00:00:00-07'
endDate='2022-07-12T14:32:33-07'
mycompany.getUsageLogs(startDate=startDate,endDate=endDate)
2022-07-13 15:28:34,654::aanalytics2.aanalytics2.analytics::getUsageLogs::DEBUG::starting getUsageLogs::2162
2022-07-13 15:28:34,655::aanalytics2.aanalytics2.analytics::getUsageLogs::DEBUG::params: {'page': 0, 'limit': 100, 'startDate': '2022-07-11T00:00:00-07', 'endDate': '2022-07-12T14:32:33-07'}::2182
Traceback (most recent call last):
Input In [27] in <cell line: 3>
mycompany.getUsageLogs(startDate=startDate,endDate=endDate)
File ~\Anaconda\lib\site-packages\aanalytics2\aanalytics2.py:2184 in getUsageLogs
data = res['content']
KeyError: 'content'
Do you have any ideas? I'm stuck ^^
Thanks!
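The KeyError suggests the API returned an error payload (for example an authentication or rate-limit error) instead of the usual page object, so the response has no 'content' key. A hedged sketch of a defensive check that surfaces the real error instead of the KeyError (the helper name is made up, not the library's API):

```python
def extract_content(res: dict) -> list:
    """Return the 'content' list from an API response, or raise the underlying error."""
    if "content" not in res:
        # Adobe error payloads carry error_code / message fields instead of content.
        raise RuntimeError(
            f"API error: {res.get('error_code', '?')} - {res.get('message', res)}"
        )
    return res["content"]
```

Printing the raw response at that point in getUsageLogs would show which error the API is actually returning.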
lib/python3.11/site-packages/pandas-1.5.3-py3.11-macosx-12-x86_64.egg/pandas/core/internals/construction.py", line 666, in _extract_index
raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length
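pandas raises this ValueError when a DataFrame is built from a dict whose column lists differ in length, which can happen when an API response is missing values for some fields. A minimal pre-check that names the offending columns, assuming the data arrives as a dict of lists (the data below is made up):

```python
def check_column_lengths(columns: dict) -> None:
    """Raise with a per-column length report if the lists are unequal."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"unequal column lengths: {lengths}")


check_column_lengths({"day": [1, 2, 3], "visits": [10, 20, 30]})  # passes silently
```

Running this on the dict just before the DataFrame constructor shows which column is short.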
The current GNU license is a strong no-go for most users who would ever want to build a non-open-source solution on top of this library.
While it may look like the proper thing to do, this license usually prevents bigger players from adopting the library, because it obliges everyone to use the same license and to disclose their source code.
I'd really recommend moving to something like the Apache 2.0 or MIT license, unless there are personal reasons for keeping the current one.