Data Mining vs Screen-Scraping
Data mining isn’t screen-scraping. I know that some people in the room may disagree with that statement, but they are actually two almost completely different concepts.
In a nutshell, you might state it this way: screen-scraping allows you to get information, whereas data mining allows you to analyze information. That’s a pretty big simplification, so I’ll elaborate a bit.
The term “screen-scraping” comes from the old mainframe terminal days, when people worked on computers with green and black screens containing only text. Screen-scraping was used to extract characters from the screens so that they could be analyzed. Fast-forwarding to the web world of today, screen-scraping now most commonly refers to extracting information from web sites. That is, computer programs can “crawl” or “spider” through web sites, pulling out data. People often do this to build things like comparison shopping engines, archive web pages, or simply download text to a spreadsheet so that it can be filtered and analyzed.
Data mining, on the other hand, is defined by Wikipedia as the “practice of automatically searching large stores of data for patterns.” In other words, you already have the data, and you’re now analyzing it to learn useful things about it. Data mining often involves lots of complex algorithms based on statistical methods. It has nothing to do with how you got the data in the first place. In data mining you only care about analyzing what’s already there.
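To make the split concrete, here is a minimal Python sketch that keeps the two steps apart. Everything in it is an assumption made for illustration: the HTML fragment, the class names `name` and `price`, and the helpers `scrape` and `mine` are hypothetical, and the "mining" step is only a toy average-based filter standing in for the statistical methods mentioned above.

```python
from html.parser import HTMLParser
from statistics import mean

# Hypothetical page fragment standing in for a downloaded web page.
PAGE = """
<ul>
  <li><span class="name">Widget</span> <span class="price">9.50</span></li>
  <li><span class="name">Gadget</span> <span class="price">24.00</span></li>
  <li><span class="name">Gizmo</span> <span class="price">11.50</span></li>
</ul>
"""


class PriceScraper(HTMLParser):
    """Screen-scraping step: pull (name, price) pairs out of the markup."""

    def __init__(self):
        super().__init__()
        self.rows = []       # collected (name, price) tuples
        self._field = None   # which span we are currently inside, if any
        self._current = {}   # fields gathered for the current <li>

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside a span we care about.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            row = (self._current["name"], float(self._current["price"]))
            self.rows.append(row)
            self._current = {}


def scrape(html):
    """Get the information: no analysis happens here."""
    parser = PriceScraper()
    parser.feed(html)
    return parser.rows


def mine(rows):
    """Analyze the information: never looks at where the rows came from."""
    average = mean([price for _, price in rows])
    above = [name for name, price in rows if price > average]
    return average, above


rows = scrape(PAGE)
average, above_average = mine(rows)
print(rows)
print(average, above_average)
```

Note that `mine` would work just as well on rows loaded from a spreadsheet or a database, which is the point: the analysis does not depend on how the data was collected.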
The difficulty is that people who don’t know the term “screen-scraping” will try Googling for anything that resembles it. We include a number of these terms on our web site to help such folks; for example, we created pages entitled Text Data Mining, Automated Data Collection, Web Site Data Extraction, and even Web Site Ripper (I suppose “scraping” is sort of like “ripping”). So it presents a bit of a problem: we don’t necessarily want to perpetuate a misconception (i.e., screen-scraping = data mining), but we also have to use terminology that people will actually use.