A Guide to Extracting Multiple Tables from a Web Page with UiPath
Data scraping has a wide range of applications, and digital businesses, marketers, and researchers all benefit from it. Here is how to extract multiple tables from a single web page using UiPath.
In this blog, we will see how to extract multiple tables from a single web page using UiPath. In UiPath, data scraping is the main way to extract data from tables on an HTML page.
Using data scraping, we can extract a single table with any number of rows and columns, but on its own it cannot extract multiple tables from a single page. By combining the Find Children activity with data scraping, we can extract multiple tables on a web page dynamically.
The Find Children activity in UiPath gets all the child elements under a particular parent element. Its output variable is a collection of UiElement objects (IEnumerable<UiElement>). Using For Each, we can iterate through each child element on a web page.
Let’s say we need to extract both tables from this page. If we use data scraping alone, we can extract only one of them. So we will loop through each element in the body, and whenever an element’s tag is “TABLE” we will run data scraping on that element and store the data in a data table.
Let’s start with a new sequence and drag and drop the Find Children activity into it. Indicate the parent element under which the tables are present. Let the output variable of Find Children be ChildElement. Add a message box after the Find Children activity to show the count of child elements, for example ChildElement.Count.ToString.
Now add a For Each activity to iterate through ChildElement. In the For Each properties pane, change TypeArgument to UiPath.Core.UiElement. Inside the For Each, add an If activity with the condition item.Get("tag").Equals("TABLE").
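The Find Children, message box, For Each, and If steps can be sketched outside UiPath to make the logic concrete. The snippet below is only an analogy in Python, not UiPath code: the page markup is hypothetical, and it assumes the markup is well-formed enough to parse as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical page body: two tables separated by other elements.
PAGE = """<body>
  <h3>Fruits</h3>
  <table><tr><td>Apple</td><td>3</td></tr></table>
  <p>Some text</p>
  <h3>Cities</h3>
  <table><tr><td>Paris</td><td>France</td></tr></table>
</body>"""

body = ET.fromstring(PAGE)

# Analogue of Find Children: the direct children of the parent element.
child_elements = list(body)

# Analogue of the message box that shows the count of child elements.
print(len(child_elements))

# Analogue of For Each + If: keep only the children whose tag is TABLE.
tables = [child for child in child_elements if child.tag.upper() == "TABLE"]
print(len(tables))
```

The tag comparison is the only thing that decides which children are processed, so the same loop works however many tables the parent contains.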
Now use data scraping inside the Then branch: indicate the table and extract it. In the generated Extract Structured Data activity, clear the selector so that it no longer points at one fixed table, and enter item in the Element field in the properties pane. The activity will then fetch whichever table the current element represents.
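The reason for clearing the selector and passing the element instead is that one extraction step can then be reused for every table. A rough Python analogy, assuming well-formed lowercase markup and a hypothetical helper named extract_table:

```python
import xml.etree.ElementTree as ET

def extract_table(table_element):
    """Return the rows of the given table element as lists of cell text.

    The function never looks up a table by a fixed address; it reads only
    from the element it is handed, like an activity driven by its Element input.
    """
    rows = []
    for tr in table_element.iter("tr"):
        cells = [
            "".join(cell.itertext()).strip()
            for cell in tr
            if cell.tag in ("td", "th")
        ]
        rows.append(cells)
    return rows

# Two hypothetical tables with different shapes.
first = ET.fromstring(
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apple</td><td>3</td></tr></table>"
)
second = ET.fromstring("<table><tr><td>Paris</td><td>France</td></tr></table>")

print(extract_table(first))
print(extract_table(second))
```

Because the target is an argument rather than something baked into the step, the loop decides which table is extracted on each pass.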
To view the table, add an Output Data Table activity, which converts the data table to a string, followed by a message box that displays that string. Finally, clear the data table so that the rows of the next table do not merge with those of the previous one.
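To see why the clearing step matters, here is a small Python analogy in which one table object is reused across iterations. The row data and the to_text helper are hypothetical, and rendering the rows as CSV text only loosely mirrors what Output Data Table does.

```python
import csv
import io

def to_text(rows):
    """Render rows as CSV text so they can be shown in a single message."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

# Hypothetical rows already extracted from two tables on the page.
extracted = [
    [["Apple", "3"], ["Pear", "5"]],
    [["Paris", "France"]],
]

data_table = []  # one table object reused on every iteration
outputs = []
for rows in extracted:
    data_table.extend(rows)              # fill the shared table
    outputs.append(to_text(data_table))  # convert it to a string for display
    data_table.clear()                   # skip this and the fruit rows leak into the next output

for text in outputs:
    print(text)
```

If each table needs to be kept rather than just displayed, it would have to be copied or written out before the clear.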
We can also pick out just the <H3> elements by changing the If condition to item.Get("tag").Equals("H3"). Inside Then, drag and drop a Get Text activity and, in the properties pane, enter item in the Element field. This returns the text inside each heading tag.
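The same loop with a different tag test can be sketched in Python as follows. Again this is an analogy rather than UiPath code, with hypothetical markup that is assumed to parse as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical page body with a heading above each table.
PAGE = """<body>
  <h3>Fruits</h3>
  <table><tr><td>Apple</td></tr></table>
  <h3>Cities <em>(Europe)</em></h3>
  <table><tr><td>Paris</td></tr></table>
</body>"""

body = ET.fromstring(PAGE)

# Same iteration as before, but the condition now matches H3 instead of TABLE.
# itertext() gathers the heading's text, including text inside nested tags.
headings = [
    "".join(child.itertext()).strip()
    for child in body
    if child.tag.upper() == "H3"
]
print(headings)
```

Only the condition and the action in the Then branch change; the Find Children and For Each steps stay as they are.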