Webwatch (github.com/jgrahamc)
105 points by jgrahamc on Oct 26, 2015 | 57 comments



crontab -l | { cat; echo "* * * * * if curl -s 'https://mysite/' | grep -q mysubstring; then echo 'found it'; fi"; } | crontab -


I've watched this command change repeatedly as you've been editing it to make it work.

This is a good example of why I didn't do this in the shell.


Looking through the git history of your project, it doesn't seem you got it correct the first time either. I don't think the biggest strength of gluing "UNIX tools" together is "make it work fast & on the first try", but "have independent tools that each do 'one thing' well".

In relation to your tool, I think curl provides many more features, easily accessible through command flags, than the limited subset of HTTP capabilities you expose (for example, basic auth or a different set of headers). The same argument goes for mailing, setting headers, and so on.
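(To be fair, both of those would only be a few lines in Go's net/http as well; a rough sketch, not the project's actual code:)

    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        req, err := http.NewRequest("GET", "https://example.com/", nil)
        if err != nil {
            panic(err)
        }
        // The two capabilities mentioned above: basic auth and a custom header.
        req.SetBasicAuth("user", "secret")
        req.Header.Set("Accept-Language", "en")

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println(resp.Status)
    }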

With that said, tools that do one thing and do it well are the ones that get used; personally, I'd just prefer it to be a function in <your-shell> instead :)


Yeah. The 'git log' really shows all the changes I had to make to the README. Oh, and an error message.


I mean, a tool can be really useful (I write tools this size all the time), but some of them need tweaks forever. I just think some 'tweaks' are already solved by other projects; that's why using already-written tools that are somewhat UNIX-y sounds like a good idea to me. That's what I tried to say; of course I don't want you to write a 100% complete program in the first commit, since that would make everything I write look really bad in comparison. Just be prepared for the pull request that lands basic auth in your project, and the next PR after that :)


Oh, here's the programmer who never makes mistakes.


I don't know how to view or edit comment history, but I never wrote anything in this comment about sending an email.

You could replace echo with sendmail to do that. Sorry if my point came across as callous to you.


No need to apologize. There's basically always a way to do it in the shell.


Nice idea, but it needs work. Firstly, and most importantly, any open source project lives and dies on its documentation. Without a basic guide to what the thing even does, no one is likely to use or support the project. Give some love to your README.md file; explaining how to use the project would be great.

Secondly, at the moment you're just doing a straightforward string comparison on the <body> of a page[1]. It'd be more useful if I could define something like a DOM querySelector or a regexp. It'd also be useful to look in the header at the page title.

[1] At least, I think so. I've never used Go so that's just what I gather from reading the source.


This is a really short little program I wrote for a quick need I had. I added a simple README. There are a ton of ways to improve it (regexp, DOM walking, automatically figure out MX, ...); if people want to do that I'd be happy to take PRs.
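(For anyone curious, the regexp and MX ideas are both covered by the standard library; a rough sketch with made-up values, not a finished feature:)

    package main

    import (
        "fmt"
        "net"
        "regexp"
    )

    func main() {
        // Regexp matching instead of a plain substring search.
        re := regexp.MustCompile(`(?i)tickets\s+on\s+sale`)
        fmt.Println(re.MatchString("<body>Tickets on sale now</body>"))

        // Automatically figure out the MX host for a recipient's domain.
        mxs, err := net.LookupMX("example.net")
        if err == nil && len(mxs) > 0 {
            fmt.Println("deliver via", mxs[0].Host)
        }
    }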

I tend to default to "stick it on Github and see if it helps someone else".


That's a fair comment. I just figured if you were posting it to HN you were looking for feedback.


Happy to have comments and even PRs.


If anyone is looking for something similar that runs in the browser, I recommend these two extensions:

Chrome: https://chrome.google.com/webstore/detail/page-monitor/pemhg...

Firefox: https://addons.mozilla.org/pt-br/firefox/addon/check4change/


I suppose I'm jealous of a project that brings nothing new compared to so many other solutions and still grabs 76 stars (as I write this). It seems, after all, that GitHub stars are another way of saying "I'm popular" and not so much a sign that a project is good.


That is a pretty rude comment, and I would definitely argue it reflects a pretty narrow view of the world. I think the project is alright, and it looks quite useful if you need something to curl a site, check something, and blast an e-mail (essentially your own IFTTT).

To the point "I'm jealous of a project that brings nothing new compared to so many other solutions": I suspect the author of the program needed to set up a website check for an event and get notified; s/he probably found that to be the motivation for building this much more than getting GitHub stars. Other people found it useful as well, and maybe it is easier for people to grok this implementation and build on it than other crawlers.

Most broadly, bitcoin combines a lot of well-understood, older technologies into something completely new. It seems this was your gripe: the project didn't do that. I just want to point out that complex coordination and reorganization of current libraries/practices/technologies can be quite useful, novel, and interesting.

edit: I actually concur with the above post a bit more now. I do think things done in Go get a bit overhyped, and if this is what the parent was referring to, I suspect s/he was correct, even if a bit prickly in expressing it.


I'm surprised this is popular. It was just a quick thing I wrote to solve a specific problem that mattered a lot.


In a similar vein, and quite easy to run locally: https://thp.io/2008/urlwatch/


Or Specto.


Are there any perks to passing arguments like this: `-url=http://cloudflare.com`? I was thinking the right way was `--url http://cloudflare.com` or `-u http://cloudflare.com`.


It's using the standard `flag` package that comes with Go; as for why `flag` parses args this way, I don't know.


The Go devs are aware of it, and adamant that this stuff is fine and that they don't want to make the flag package "any more complex", since it's so easy to install a different one (never mind that of course people are going to use the built-in one...). I find this absolutely ridiculous given how nonstandard it is among today's shell tools: -flag is supposed to be interpreted as -f -l -a -g, or as -f "lag", depending on whether -f takes an argument.


I was thrilled when I learned that flag will also do the --option=value format. It might even do --option value too - test it?
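(It does, for non-boolean flags. A quick sketch to verify, with a made-up flag name:)

    package main

    import (
        "flag"
        "fmt"
    )

    func main() {
        // A string flag accepts all of -url=X, --url=X, -url X, and --url X.
        // Boolean flags are the exception: they require the -name=value form.
        url := flag.String("url", "", "URL to fetch")
        flag.Parse()
        fmt.Println("url =", *url)
    }

Save it as flagdemo.go, and `go run flagdemo.go --url http://cloudflare.com` prints the same value as the `-url=` form.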


Tell that to Oracle. https://i.imgur.com/fQ6pQLn.png


There is a startup for that: https://monitorbook.com/


And it was: "Crafted with <3 in San Francisco"

so there is that


I especially love the feature tick "Push Notifications (coming soon)" as a reason to go for their higher tier subscription.


There are quite a lot of edge cases that can be triggered when fetching HTTP responses. Perhaps a small test suite would be beneficial in order to attract new developers who don't feel like breaking anything? (-:
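(Something along these lines, using the standard net/http/httptest package, would be a starting point; checkBody here is a hypothetical stand-in for webwatch's matching logic, not its real function name:)

    package main

    import (
        "io/ioutil"
        "net/http"
        "net/http/httptest"
        "strings"
        "testing"
    )

    // checkBody fetches a URL and reports whether its body contains want.
    func checkBody(url, want string) (bool, error) {
        resp, err := http.Get(url)
        if err != nil {
            return false, err
        }
        defer resp.Body.Close()
        body, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            return false, err
        }
        return strings.Contains(string(body), want), nil
    }

    func TestCheckBody(t *testing.T) {
        // Serve a fixed page locally so the test never touches the network.
        srv := httptest.NewServer(http.HandlerFunc(
            func(w http.ResponseWriter, r *http.Request) {
                w.Write([]byte("<html><body>tickets on sale</body></html>"))
            }))
        defer srv.Close()

        found, err := checkBody(srv.URL, "tickets")
        if err != nil || !found {
            t.Errorf("expected a match, got found=%v err=%v", found, err)
        }
    }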




Love changedetection. I've been using it for years!


I built the same thing using NodeJS a couple of weeks ago, with phantomjs support (JavaScript execution), mandrill (emailing), and some other nice options: https://github.com/mgcrea/node-web-watcher


What string? Is it a webpage modification or just a whois modification? What exactly is it looking for?


I assume that this program kinda works like "curl <page> | grep <string> && mailx -s 'Match' <email> <<< 'matched'".

Useful when you want to periodically check if a page changed - I've used a similar thing to get concert tickets before anyone else.

That being said, I feel like a browser extension might be more useful than a command line script, for this particular use case.


A browser extension is handy, but requires your browser to be open in order to work. A script, on the other hand, can just be thrown up on a server and forgotten about.


It's looking for a string in the page HTML body. https://github.com/jgrahamc/webwatch/blob/master/src/webwatc...
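(The essence, as far as I can tell from the source, compressed into a sketch; the SMTP details below are assumptions, not webwatch's actual flags:)

    package main

    import (
        "fmt"
        "io/ioutil"
        "net/http"
        "net/smtp"
        "strings"
    )

    func main() {
        resp, err := http.Get("https://example.com/")
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        body, _ := ioutil.ReadAll(resp.Body)

        if strings.Contains(string(body), "mysubstring") {
            // Assumes a local MTA listening on port 25.
            msg := []byte("Subject: webwatch match\r\n\r\nFound it\r\n")
            err = smtp.SendMail("localhost:25", nil,
                "me@example.net", []string{"me@example.net"}, msg)
            fmt.Println("match; mail sent, err =", err)
        }
    }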


Someone needs to make a tool to monitor the documentation for changes.


What about the absence of a phrase? I would like to be able to do

  webwatch \
    -url=https://example.com/privacy/ \
    -warnmissing="never received a National Security Letter" \
    -from=me@example.net \
    -to=eff@eff.org
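(The inverse check would be a tiny change; a sketch with a hypothetical -warnmissing flag, which is not an existing feature:)

    package main

    import (
        "flag"
        "fmt"
        "io/ioutil"
        "net/http"
        "strings"
    )

    func main() {
        url := flag.String("url", "", "URL to fetch")
        warnMissing := flag.String("warnmissing", "", "warn if this string is absent")
        flag.Parse()

        resp, err := http.Get(*url)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        body, _ := ioutil.ReadAll(resp.Body)

        // Alert on absence rather than presence: a vanished
        // warrant canary is exactly the event to be warned about.
        if !strings.Contains(string(body), *warnMissing) {
            fmt.Println("WARNING: phrase is gone")
        }
    }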



I understood your point, but you might be better received if you responded something like the following:

"That's a great idea! I have no personal need for such a feature, but if you do and are able to submit a pull request, I'd be pleased to merge it."


Off topic: is it OK to re-post an ignored article [0] the very next day? Just curious, not complaining :)

[0] https://news.ycombinator.com/item?id=10443814


I reposted because I got the following email from HN:

    Hi there,

    https://news.ycombinator.com/item?id=10443814 looks good, but didn't
    get much attention. Would you care to repost it? You can do so
    here: https://news.ycombinator.com/repost?id=10443814.

    Please use the same account (jgrahamc), title, and URL. When these match,
    the software will give the repost an upvote from the mods, plus we'll
    help make sure it doesn't get flagged.

    This is part of an experiment in giving good HN submissions multiple
    chances at the front page. If you have any questions, let us know. And
    if you don't want these emails, sorry! Tell us and we won't do it again.

    Thanks for posting good things to Hacker News,
    Daniel


I got the same mail, and indeed my submission went from no attention at all to staying on the front page for a while. I figured that maybe it had to do with posting time; perhaps the email is sent when it's a good time to repost? Or the first upvote is crucial?


Interesting that the process needed you to repost it for the mods to boost it. Seems like they could have just fiddled with it without you having to manually interact with it.

I can't help but wonder what the logic is there.


Also interesting that HN is moving (has moved?) toward being a curated site. HN asks for reposts of things they deem good. They also adjust the score of many articles downward (as can be seen through large jumps on sites that track article ranks; some of that will be automatic from the flamewar detector, some is likely manual).

It seems like we're reaching a "web 3.0" in which users do the expensive bit of an initial sift, but then the site admins edit/curate that into their own vision.

We're moving away from user driven content, back to curated content with user-sourcing.


Web 3 or not, I'd see it as an extension of user sourcing, where users have various levels of moderation powers. I would guess these HN emails (I also received one recently and duly reposted) are triggered by some count of admins voting up unloved posts, maybe from a list filtered by a user karma threshold.

As Jeff Atwood says about StackOverflow, it should be possible for a sufficiently privileged user to do just about anything staff can do.

Not really a new concept, as /. had the notion of metamoderation, but this is a richer model with multiple levels of user.


So Slashdot has survived long enough to be on the vanguard again!


Could you please make this legal in the US by honoring robots.txt and scanning any links to the ToS for words forbidding "automated access", "crawling", "spidering", "polling", etc.?
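(A minimal robots.txt check is not much code either; this sketch hand-parses only the blanket "User-agent: *" group and is an approximation, not a full parser:)

    package main

    import (
        "bufio"
        "fmt"
        "net/http"
        "strings"
    )

    // allowedByRobots does a crude check of /robots.txt: it reads the
    // "User-agent: *" group and returns false if path falls under a Disallow.
    func allowedByRobots(site, path string) bool {
        resp, err := http.Get(site + "/robots.txt")
        if err != nil {
            return true // robots.txt unreachable: assume allowed
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            return true
        }

        inStarGroup := false
        scanner := bufio.NewScanner(resp.Body)
        for scanner.Scan() {
            line := strings.TrimSpace(scanner.Text())
            if strings.HasPrefix(line, "User-agent:") {
                agent := strings.TrimSpace(strings.TrimPrefix(line, "User-agent:"))
                inStarGroup = agent == "*"
            } else if inStarGroup && strings.HasPrefix(line, "Disallow:") {
                rule := strings.TrimSpace(strings.TrimPrefix(line, "Disallow:"))
                if rule != "" && strings.HasPrefix(path, rule) {
                    return false
                }
            }
        }
        return true
    }

    func main() {
        fmt.Println(allowedByRobots("https://example.com", "/"))
    }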


Hey jgrahamc, neat tool. Could you please add bins to the repo? I know, I know, I can compile it myself. But not everybody has the luxury of installing Go just to try it...


No. I really hate adding binaries to git repos.


You don't need to add it to the git repo itself. You can create a release in Github and attach binaries.


It's like self-hosted Google Alerts for one page.


I was going to suggest Google Alerts. It works really well!


Then the description should have started like this: Self-hosted Google Alerts in Go!

That would have saved me a couple of minutes.


Useful tool for what is a very common task, nice work jgrahamc!


The hard part is to know what string to search for.


Not to demean your work, but I also replicated what you did in Node-RED.

And it also goes to twitter. And console. And MongoDB.

It took me 5 minutes.

http://imgur.com/lbHoTIb

(Below is a JSON link to replicate what I did.)

http://pastebin.com/i6KhuwbX


Fun.



