'''Hacker News''', aka '''HN''', is a startup/technology news website created by [Paul Graham]. It can be found at https://news.ycombinator.com/.
** Postings about Tcl **
[{Tcl the Misunderstood} {Hacker News} {2014 01 16}%|%Tcl the Misunderstood], 2014-01-16
** Scraping Hacker News **
To run the following script, you will need [Tcllib] and [tls] installed, as well as a copy of the [treeselect] module in the directory you run the script from (the script adds `.` to the module search path). You can download treeselect with [wiki-reaper]: `wiki-reaper 41023 0 8 > treeselect-0.3.1.tm`.
Note: This is just a demonstration. You can use the [https://github.com/HackerNews/API%|%JSON API] as an alternative to scraping, and for serious applications you should.
======
# version 0.0.1
::tcl::tm::path add .
package require treeselect
set tree [::treeselect::url-to-tree https://news.ycombinator.com/news]
set nodes [$tree nodes]
# Text of each story title link.
set titles [::treeselect::get $tree \
        [::treeselect::query $tree "td.title a PCDATA" $nodes] data]
# Text of each score element, e.g. "43 points".
set scores [::treeselect::get $tree \
        [::treeselect::query $tree ".score PCDATA" $nodes] data]
# The href attribute of each story title link.
set links [lmap x [::treeselect::get $tree \
        [::treeselect::query $tree "td.title a" $nodes] data] {
    dict get [::treeselect::parse-attributes $x] href
}]
set stories {}
# Keep only the entries that have a score.
foreach title $titles score $scores link $links {
    if {$score ne ""} {
        lappend stories $title $score $link
    }
}
foreach {title score link} $stories {
    puts "($score) $title - $link"
}
======
*** Sample output ***
|` (43 points) Marty.js – A JavaScript library for state management in React applications - http://martyjs.org/ `|
|` (25 points) Show HN: Metamon, a Vagrant/Ansible toolkit for kickstarting Django apps - http://blog.tryolabs.com/2015/01/20/introducing-metamon-for-kickstarting-django-development/ `|
|` (151 points) Emacs Is My New Window Manager - http://www.howardism.org/Technical/Emacs/new-window-manager.html `|
|` (51 points) You’ll Always Miss Being in the Basement - http://zachholman.com/posts/the-basement/ `|
|` (25 points) First LibreOffice for Android app released - https://libreoffice-from-collabora.com/libreoffice-for-android-released/ `|
|` (...) `|
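*** Using the JSON API instead ***

As noted above, the JSON API is the better choice for anything beyond a demonstration. The following is a minimal sketch of that approach, not a tested replacement for the scraper. It assumes Tcl 8.6, the `json` package from [Tcllib], and a [tls] build recent enough to support `-autoservername`. The procedure names `fetch-json`, `format-story` and `print-top-stories` are made up for this example. The `topstories.json` and `item/<id>.json` endpoints are the ones described in the API documentation linked above.

======
# Sketch: list top stories via the Hacker News JSON API.
# Assumptions: Tcl 8.6, Tcllib's json package, tls with -autoservername.
package require http
package require tls
package require json

::http::register https 443 [list ::tls::socket -autoservername true]

# Hypothetical helper: GET a URL and return the decoded JSON value.
proc fetch-json url {
    set token [::http::geturl $url]
    try {
        if {[::http::ncode $token] != 200} {
            error "GET $url failed: [::http::code $token]"
        }
        return [::json::json2dict [::http::data $token]]
    } finally {
        ::http::cleanup $token
    }
}

# Hypothetical helper: format one item roughly like the scraper's output.
# Some items (e.g. job postings or Ask HN) lack "score" or "url",
# so defaults are merged in first.
proc format-story story {
    set story [dict merge {score 0 url {}} $story]
    return "([dict get $story score] points)\
            [dict get $story title] - [dict get $story url]"
}

# Hypothetical helper: print the first n top stories (one request per story).
proc print-top-stories {{n 5}} {
    set base https://hacker-news.firebaseio.com/v0
    set ids [fetch-json $base/topstories.json]
    foreach id [lrange $ids 0 [expr {$n - 1}]] {
        puts [format-story [fetch-json $base/item/$id.json]]
    }
}
======

Calling `print-top-stories 5` should print five lines in a format similar to the sample output above. Each story costs one HTTP request, so keep `n` small.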
------
** See also **
* [Web Scraping with htmlparse]
<<categories>>Web