How to “merge” or “transform” JSON documents in Azure Cosmos DB


I'm setting up a chatbot with the Microsoft Bot Framework and Azure. I want to save my "UserState" in a database so that I can easily analyze the user data. I managed to save my userState in the form of JSON documents in Azure Cosmos DB.

The problem is that each interaction with the bot creates a new "document" in a "collection" in Cosmos DB.

How can I easily merge the data (data structure is consistent) and in the best case have the data in some kind of table? The tool I want to use for analyzing requires .txt or .csv files.

This is a snippet of the JSON file which stores the user data.

{
  "id": "emulator*2fusers*2f9321b527-4699-4b4a-8d9d-9cd9fa8f1967*2f",
  "realId": "emulator/users/9321b527-4699-4b4a-8d9d-9cd9fa8f1967/",
  "document": {
    "userData": {
      "name": "value",
      "age": 18,
      "gender": "value",
      "education": "value",
      "major": "value"
    },
    "userDataExtended": {
      "roundCounter": 3,
      "choices": [ "A", "A", "B" ]
    }
  },
  "_rid": "0k5YAPBrVaknAAAAAAAAAA==",
  "_self": "dbs/0k5YAA==/colls/0k5YAPBrVak=/docs/0k5YAPBrVaknAAAAAAAAAA==/",
  "_etag": "\"ac009377-0000-0000-0000-5c59c5610000\"",
  "_attachments": "attachments/",
  "_ts": 1549387105
}

In the best case I want to have the data in a table structure with columns "name", "age", etc. and each user (document) as a row.

Thank you!


There are a few things in your question, and I'll address each of them separately.

<h3>Expanding on Drew's comment:</h3>

You have multiple documents being created because you're running the bot through the Emulator. Each time the Emulator restarts, it creates a new User ID, and therefore a new document for that user and another one for that user's conversation. You will not have this issue if you use a channel other than the Emulator, provided that the User ID remains consistent.

<h3>Regarding merging documents:</h3>

I'm not sure exactly what you're looking for, but you might be able to use SQL queries to accomplish what you need. Just click "New SQL Query". For example, running SELECT * FROM c returns every document in the collection in a single result set, which you can then save as one JSON array.
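If you save that query output as a JSON array, a short script can flatten each document into one row per user. Here is a minimal sketch in Python, assuming the document shape shown in the question; the function name `flatten_user_document` and the choice to join `choices` with `|` are illustrative assumptions, not something the Bot Framework or Cosmos DB provides.

```python
import json


def flatten_user_document(doc):
    """Turn one stored user-state document into a flat dict (one table row)."""
    state = doc.get("document", {})
    row = {"realId": doc.get("realId", "")}
    # Copy the profile fields (name, age, gender, ...) as individual columns.
    row.update(state.get("userData", {}))
    extended = state.get("userDataExtended", {})
    row["roundCounter"] = extended.get("roundCounter")
    # A list does not fit in a single cell, so join the choices into one string.
    row["choices"] = "|".join(extended.get("choices", []))
    return row


# Stand-in for the JSON array returned by the query (values taken from the question).
query_output = json.loads("""
[
  {
    "realId": "emulator/users/9321b527-4699-4b4a-8d9d-9cd9fa8f1967/",
    "document": {
      "userData": {"name": "value", "age": 18, "gender": "value",
                   "education": "value", "major": "value"},
      "userDataExtended": {"roundCounter": 3, "choices": ["A", "A", "B"]}
    },
    "_ts": 1549387105
  }
]
""")

rows = [flatten_user_document(d) for d in query_output]
print(rows[0])
```

Each resulting dict has the columns "name", "age", and so on, so it can be handed to any CSV or table writer.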

<h3>Regarding text/csv files:</h3>

I'm not sure what your tool is, but if it can handle JSON, then the above might work for you. If not, you can implement custom middleware to get the txt/csv output you're looking for. Here's a sample that shows something relatively similar. There isn't an equivalent example in C#, but you can still implement your own middleware to do the same thing.

<h3>Regarding Tables:</h3>

If you're really looking for Table Storage, it was supported in V3 bots, but replaced by blob storage in V4. You could write your bot in V3. Similar to what Jay said, you might still be able to use a trigger function to send it to table storage, but then you're storing the data twice.

<h3>Regarding analysis:</h3>

If all you're really looking for is analysis, Application Insights/Bot Analytics may be what you need, although I don't believe it will provide the detail you're looking for.


In the best case I want to have the data in a table structure with columns "name", "age", etc. and each user (document) as a row.


You will need an additional service to implement this requirement, because the data collected by the bot service is already stored as documents in Cosmos DB.

In my opinion, an Azure Function with a Cosmos DB trigger is a good option for you. The function is triggered whenever documents are inserted or updated in your Cosmos DB collection.

You can find more explanation at this link. The point I want to make is that you could configure Cosmos DB as the input binding and Azure Blob Storage as the output binding (for example, a specific CSV file). In the function, you could read your desired columns with the Cosmos DB SDK and assemble them into any format you want.
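The "assemble them into any format" step could look roughly like the following. This is only a sketch of the conversion logic in Python, assuming the document shape from the question; the function name `documents_to_csv` and the `COLUMNS` list are hypothetical, and the trigger and blob binding configuration is deliberately left out.

```python
import csv
import io

# Assumed column set, taken from the "userData" object in the question.
COLUMNS = ["name", "age", "gender", "education", "major"]


def documents_to_csv(documents):
    """Build CSV text with one header row and one row per user document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        extrasaction="ignore",  # skip fields that are not in COLUMNS
        lineterminator="\n",
    )
    writer.writeheader()
    for doc in documents:
        # Missing fields are written as empty cells (DictWriter's default).
        writer.writerow(doc.get("document", {}).get("userData", {}))
    return buffer.getvalue()


# Example batch, as the function might receive it from the trigger.
changed_documents = [
    {"document": {"userData": {"name": "value", "age": 18, "gender": "value",
                               "education": "value", "major": "value"}}},
]
csv_text = documents_to_csv(changed_documents)
print(csv_text)
```

The returned string is what the function would hand to the blob output, so the analysis tool only ever sees a plain .csv file.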


