DataTreeMap: a data mining compendium and a place for data collection, analysis and visualization ideas.

Latest Posts on data collection

How to collect location data using the Flickr API

May 26, 2017 9:00 PM

Flickr is a photo-sharing web service owned by Yahoo. According to Wikipedia, it has about 100 million users and ranks among the top 150 sites. Beyond that, Flickr is a wonderful source of georeferenced information. This article shows how to harvest that information using the Flickr REST API.
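To give a rough idea of what such a request might look like, here is a minimal sketch that only builds the URL for a `flickr.photos.search` call restricted to geotagged photos. It does not contact Flickr. The function name `build_geo_search_url`, the placeholder API key and the search text are assumptions made for illustration, not taken from the article.

```python
from urllib.parse import urlencode

# Public REST endpoint of the Flickr API.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_geo_search_url(api_key, text, per_page=100):
    """Build a flickr.photos.search URL for geotagged photos.

    Hypothetical helper: it only assembles the query string and
    performs no network request.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,      # placeholder, use your own key
        "text": text,            # free-text search term
        "has_geo": 1,            # only photos that carry a location
        "extras": "geo",         # ask for latitude/longitude in the response
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,     # plain JSON instead of a JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_geo_search_url("YOUR_API_KEY", "lighthouse"))
```

The resulting URL can then be fetched with any HTTP client. Each returned photo record should carry coordinates when `extras=geo` is requested.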

continue reading...

How to collect RSS feed data from blogs

May 24, 2017 8:00 AM

Almost all sites or blogs can be read online or via their RSS feeds. An RSS feed is a simple XML document that contains information about the blog and all its entries. This is valuable information, as it can be processed in different ways, for example for text analysis. This guide shows how to collect feed data from The Economist, but it can be applied to any site that has an RSS feed section. Let's go through the steps.
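As a small illustration of the "simple XML document" idea, the sketch below parses an RSS 2.0 feed with Python's standard library. The feed in `SAMPLE_FEED` is made up, and `parse_feed` is a hypothetical helper name. A real feed would be downloaded first and may contain more fields than the three read here.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <description>Text of the first entry.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
      <description>Text of the second entry.</description>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return the blog title and a list of entries from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    entries = []
    for item in channel.findall("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return {"title": channel.findtext("title", default=""), "entries": entries}


if __name__ == "__main__":
    feed = parse_feed(SAMPLE_FEED)
    print(feed["title"])
    for entry in feed["entries"]:
        print("-", entry["title"], entry["link"])
```

The `description` text of each entry is the part that would typically feed a text analysis step.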

continue reading...

How to parse a PDF file in Java

April 24, 2017 6:00 PM

Sometimes you need to parse text in your applications, and this text might be in PDF format. In Java it is possible to parse it and create a text file using the Apache PDFBox library. This guide shows how to use this Java library to convert a PDF file into a TXT file.

continue reading...