w3m

[http://w3m.sourceforge.net%|%w3m%|%] is a text-mode browser.  Sometimes 
when people think they want to do complicated stuff that involves
[invoking browsers] and controlling them with [OLE], [COM], or [CORBA], all they're really after is a bit of Web automation.  w3m and [Expect]
team up to provide that:

======
#!/bin/sh
# magic \
exec expect "$0" "$@"

proc w3m:start {uri} {
    uplevel 1 spawn w3m $uri
}

proc w3m:quit {} {
    send qy\r
    expect eof
}

proc w3m:field_after_label {lab val} {
    send g/$lab\r\t\r$val\r
}

proc w3m:next_field {val} {
    # move to the next field, open it, type the value, confirm
    send \t\r$val\r
}
proc w3m:dump_file {name} {
    file delete $name
    send "S$name\r"
}
proc w3m:dump_source {name} {
    file delete $name
    send "\033s\001\013$name\r"
}

w3m:start http://wiki.tcl-lang.org/recent
w3m:dump_file ~/wiki/recent.txt
w3m:quit
exec mail -s "wiki news" [email protected] < ~/wiki/recent.txt
======


et cetera.

The advantage of this method is that programming the browser is very similar to using it interactively.  To fill
in an online form and submit it automatically, you don't need to look at the HTML
source (to find field names and the form ACTION, which may be
session-dependent, complicating your task even more).
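
As a rough sketch of what that looks like in practice, the helper below only builds the keystrokes that w3m:field_after_label above would send, so the sequence can be inspected before it goes to the browser. The proc name w3m:keys_for_field, the labels "Name:" and "City:", and the values are made up for illustration; they are not part of w3m or Expect.

======
# Hypothetical helper: return the keystrokes for one labelled field.
#   g        jump to the first line of the page
#   /$lab\r  search forward for the label text
#   \t\r     move to the next form field and open it for editing
#   $val\r   type the value and confirm
proc w3m:keys_for_field {lab val} {
    return "g/$lab\r\t\r$val\r"
}

# Assumed labels and values; a real form will differ.
set keys ""
foreach {lab val} {Name: Anton City: Riga} {
    append keys [w3m:keys_for_field $lab $val]
}

# In an Expect script with w3m already spawned:
#   send -- $keys
======

Only the visible label text is needed here, which is the point made above: nothing in the script depends on field names or the form ACTION.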

Why have I chosen w3m? Unlike other well-known text-mode browsers, w3m
processes user input synchronously. For example, when I tried to do
the same with lynx, it refused to save recent.txt if it received "S"
before the page had loaded, so the script would have to expect some
notification from lynx that it is ready for further input.
With a full-screen terminal application, that is not so simple. With w3m, there is no such problem.


[A/AK]


Why not use [http]?  This approach starts to show real advantages when
submitting forms.

Anton's thinking of automating a "Web editor" which retrieves a page,
uses his favorite editor locally, and pushes the page back to its 
proper place.

----

[jmn] 2004-04-09

If I have a shell script similar to the above and need to return some data on stdout, how do I stop all the output from the w3m child process from trashing the parent's stdout?

Answering my own question: 

======
log_user 0
======

seems to tidy up stdout nicely.
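
A minimal sketch of how that might fit together, assuming the same keystrokes as the script at the top of the page. The proc name w3m:fetch_text and the temporary file path are hypothetical; log_user, spawn, send and expect are the usual Expect commands.

======
# Hypothetical wrapper: fetch a page as rendered text and return it,
# without w3m's screen output reaching our own stdout.
proc w3m:fetch_text {uri tmp} {
    log_user 0          ;# stop Expect from echoing the child's output
    file delete $tmp
    spawn w3m $uri
    send "S$tmp\r"      ;# same keystroke as w3m:dump_file above
    send "qy\r"         ;# quit and confirm
    expect eof          ;# wait for w3m to exit (subject to Expect's timeout)
    set fh [open $tmp r]
    set text [read $fh]
    close $fh
    return $text
}

# Only this line would write to stdout (path is an assumption):
#   puts -nonewline [w3m:fetch_text http://wiki.tcl-lang.org/recent /tmp/recent.txt]
======

Because spawn is called inside the proc, spawn_id stays local to it, so the send and expect calls that follow talk to that w3m process only.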
----
Snippet example, selecting a value from a drop-down input:
======
set lab "State:"
# select the field
send g/$lab\r\t\r
# press arrow-down 4 times, then Enter to pick that value from the drop-down
send "\033OB"
send "\033OB"
send "\033OB"
send "\033OB\r"

# \033OA is arrow-up
======
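
The repeated sends above could be folded into a small helper. This is only a sketch: the proc name w3m:keys_select_down is hypothetical, and it assumes the terminal uses the same \033OB down-arrow sequence as the snippet.

======
# Hypothetical helper: keystrokes that move $n entries down in a
# drop-down that is already open, then confirm with Enter.
proc w3m:keys_select_down {n} {
    set down "\033OB"   ;# down-arrow sequence, as in the snippet above
    return [string repeat $down $n]\r
}

# In an Expect script with w3m already spawned:
#   send -- "g/State:\r\t\r"
#   send -- [w3m:keys_select_down 4]
======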
<<categories>> Internet