I received a query on how we can extract information from an automated report that arrives in an email and populate an excel spreadsheet.
The Sample report I am going to use is:
Date 6/10/23 Time 13.25 Input File /usr/tract/tmp/slorderin_ABH001_660303893_ET3327624.in
File Type TRACT – TRACT_EDI
EDI Format:
Our Customer XYZ002 61HS04 New Customer XYZ002 Smith, Mr. John
Order No. 123456789
ERROR 12345 McV Jaffa Cake Snack Pack 20x4pack 1.00
ERROR **** Above product is discontinued – removed from order! ***
Our Customer XYZ002 61HS04 New Customer XYZ002 Smith, Mr. John
Order No. 987654321
ERROR 67890 Diet Seven Up 7UP Cans 24x330ml 1.00
ERROR **** Above product is discontinued – removed from order! ***
Our Customer XYZ002 61HS04 New Customer XYZ002 Smith, Mr. John
Order No. 246813579
ERROR 13579 Tango Diet Orange Cans 24x330ml 1.00
ERROR **** Above product is discontinued – removed from order! ***
Observations:
- The first thing we observe is that the report consists of a repeating pattern which is
Our Customer XYZ002 61HS04 New Customer XYZ002 Smith, Mr. John
Order No. 987654321
ERROR 67890 Diet Seven Up 7UP Cans 24x330ml 1.00
ERROR **** Above product is discontinued – removed from order! ***
- The second observation is that the repeating pattern starts with Our Customer.
Approach
We are going to approach the solution by separating out each repeating section in an array.
This can be achieved simply using the split function with the following expression: We use Our customer as the delimiter.
split(outputs('Compose'),variables('NewLine'))
Compose 2 action will return an array of string with the repeating pattern as shown below:
Our next goal is to get rid of the non-repeating first line. We achieve that by using the skip function.
skip(outputs('Compose_2'),1)
We can now run a loop on this array and process each line individually as explained in this blog.