Using cron to collect daily data

Checking the Jakarta Composite Index (JCI) and other stock exchange indexes, and then writing them down, can be a tedious process. Luckily, Yahoo Finance provides readily downloadable data in CSV format. So, for example, for the JCI you can download the latest information here.

I am interested in this kind of data for playing around, for example for data visualisation. I thought Yahoo Finance would have RSS feeds, but it turned out the feeds are only for news. I also thought that IFTTT couldn’t download files (it turned out it can, but it is not really suitable for my purpose). So I turned to cron and curl to automate the job. I am writing down the process here mainly as a note to myself (I don’t usually write shell scripts).

The idea is to download the file at a particular time (I chose 4.30pm) every working day. This means an entry like this in crontab:


30 16 * * 1-5 sh /your/path/to/home/directory/bin/download-jkse.sh

This means the script download-jkse.sh will be executed at 16.30 every day of the month, every month, but only from Monday to Friday (working days). The next step is to write the script itself.

The main work is done by curl, which downloads the appropriate file at the chosen time:

curl "http://download.finance.yahoo.com/d/quotes.csv?s=%5EJKSE&f=sl1d1t1c1ohgv&e=.csv" >> /home/gombang/JKSE/jkse.csv

The URL is quoted because otherwise the shell interprets its special characters (such as &) before curl ever sees them. We only want one file at all times, so we append each download to the end of a CSV file (jkse.csv) using the >> operator.

That would be the only line in the script if I had access to an always-online server. I don’t, so I have to check whether my laptop is fully connected to the Internet. I do this using nmcli, the command-line client of NetworkManager. If we are not connected, wait for 5 minutes and check again. If connected, download the file.

The full short script turned out like this:

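#!/bin/bash
# Wait until NetworkManager reports full connectivity, checking every 5 minutes.
while [ "$(nmcli networking connectivity)" != "full" ]; do
    sleep 300
done
# Append the latest quote line to the CSV file.
curl "http://download.finance.yahoo.com/d/quotes.csv?s=%5EJKSE&f=sl1d1t1c1ohgv&e=.csv" >> /your/path/to/home/JKSE/jkse.csv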

This adds the latest data to jkse.csv every day at 4.30pm, or later if we are not connected at that time or (thanks to anacron) the machine is not up yet. The script is easily extended to download additional files, for example data from another stock exchange. We may need to check whether a particular day is a holiday (and then not download any data), but it may be easier just to clean up the CSV file later.
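One way the extension and the later clean-up could look is sketched below. It assumes the same URL pattern as the script above; the extra symbols (STI, KLSE), the DATA_DIR variable and the helper names quote_url, fetch_quote, fetch_all and dedupe_rows are all hypothetical, picked only for illustration. The clean-up also assumes that a download on a holiday simply repeats the previous row, which I have not verified.

```shell
#!/bin/bash
# Sketch only: symbols other than JKSE and all helper names are illustrative.
DATA_DIR="${DATA_DIR:-$HOME/quotes}"   # hypothetical output directory
SYMBOLS="JKSE STI KLSE"

# Build the download URL for one index; %5E is the URL-encoded caret (^).
quote_url() {
    printf 'http://download.finance.yahoo.com/d/quotes.csv?s=%%5E%s&f=sl1d1t1c1ohgv&e=.csv' "$1"
}

# Append the latest quote line for one symbol to its own lowercase-named CSV file.
fetch_quote() {
    curl --silent --fail "$(quote_url "$1")" \
        >> "$DATA_DIR/$(printf '%s' "$1" | tr 'A-Z' 'a-z').csv"
}

# Download every symbol in the list, reporting failures without stopping.
fetch_all() {
    mkdir -p "$DATA_DIR"
    for symbol in $SYMBOLS; do
        fetch_quote "$symbol" || echo "download failed for $symbol" >&2
    done
}

# Clean-up: drop rows that repeat an earlier row exactly, keeping the original order.
dedupe_rows() {
    awk '!seen[$0]++'
}
```

Calling fetch_all after the connectivity loop would take the place of the single curl line, and something like dedupe_rows < jkse.csv > jkse-clean.csv could be run whenever the file needs tidying.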

Resources:

  1. Introduction to cron
  2. Bash programming howto

Digital journalism tools

I came across OSJourno yesterday. It is dubbed “robust power tools for digital journalists”. It is basically a Fedora remix packaged as a virtual machine appliance or, alternatively, a LiveCD. I plan to download and try it out soon, but for now I’ll just note one thing: being a “digital journalist” seems to demand a lot of skills.

Judging by the tools OSJourno provides, digital journalists should be skilled at web development, programming (R, Python, Julia), statistics, and data science (natural language processing, machine learning, social network analysis, and finance). Not to mention secure communications (encrypted emails and instant messaging, privacy tools, etc). I doubt they teach those at journalism school. At least not now, not in Indonesia.

KDE Plasma 5.2

I lack excitement in life, so I upgraded the operating system on my laptop. Running alpha software is never boring. It is Fedora 22 with KDE Plasma 5.2. Here is a screenshot:

plasma5.6

Other than a bug in systemd that obstructed the upgrade (easily solved by updating systemd from the updates-testing repository), it has been smooth sailing so far.

Fixing hibernation on Fedora 21

After the latest update (29 March), my laptop, which runs Fedora 21, couldn’t resume from hibernation. I can’t determine which update broke the system, but usually the first suspect is the kernel. Reverting to older kernels didn’t fix the problem, so I guess there are other factors in play.

After some detective work I could see that the system failed to read the hibernation image when resuming. My first guess was that there was something wrong with the swap partition, but it seemed OK.

Further Googling and reading about how to enable hibernation on Linux suggested the following solution: GRUB should tell the Linux kernel where to find the hibernation image (i.e. the swap partition).

That should be straightforward: edit /etc/default/grub and add the following line:


GRUB_CMDLINE_LINUX_DEFAULT="resume=/dev/disk/by-uuid/6fffdb46-0e7c-4fee-ace4-75cdb30fad5c"

where 6fffdb46-0e7c-4fee-ace4-75cdb30fad5c is the UUID of my swap partition (yours will be different). The next step is to run grub-mkconfig (grub2-mkconfig on Fedora 21):


grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

Reboot, and hibernation works again. I still don’t know how the update broke it in the first place though.
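The UUID can be looked up instead of copied by hand. The sketch below assumes blkid (from util-linux) is installed and that the first swap partition it reports is the one used for hibernation; resume_arg and swap_uuid are hypothetical helper names, not part of any tool.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative.

# Print the kernel argument for a given swap UUID.
resume_arg() {
    printf 'resume=/dev/disk/by-uuid/%s' "$1"
}

# Print the UUID of the first swap partition blkid finds (may need root).
swap_uuid() {
    blkid -t TYPE=swap -o value -s UUID | head -n 1
}

# To print the line for /etc/default/grub, something like:
#   uuid=$(swap_uuid) && echo "GRUB_CMDLINE_LINUX_DEFAULT=\"$(resume_arg "$uuid")\""
```

The output still has to be pasted into /etc/default/grub by hand, followed by the grub2-mkconfig step above.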

Project Gutenberg and cloud storage integration

I was writing a blog draft (about politics, so I don’t think it will come out here) when I thought I would benefit from reading Tocqueville’s Democracy in America. This is an old book, written in the 19th century, so naturally it is in Project Gutenberg’s collection. The original is in French, but it was quickly translated into English.

Project Gutenberg now offers the option to download ebooks to an online storage service, such as Dropbox (and Google Drive and Microsoft OneDrive). I have known this for some time, but usually I just ignored it. This time I was intrigued enough to try.

If you choose to download to Dropbox, the service will ask your permission to authorise Dropbox. If you agree, Project Gutenberg will create a new directory inside your Dropbox folder and place your downloaded books there. This means I am now able to access my downloaded Project Gutenberg books through Dropbox. Which is nice: I don’t always bring my laptop with me. I haven’t installed the Dropbox app on my phone, but I think I will now.

Shellshocked: A collection of links about the Bash bug

I don’t really think I understand the new Bash bug, cutely named “shellshock”. I’ll just use this post as a dumping ground for links I have quickly collected. But some explanation first.

Bash is a shell, or command line interpreter, used by various Unix-derived or (in the case of Linux) clone operating systems. It is included not only in various GNU/Linux distributions, such as Red Hat, Ubuntu or Debian, but also in a more mainstream operating system: Mac OS X. There is a bug in Bash that enables attackers to execute commands remotely, and potentially to do naughty things.
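For the curious, the one-line check that circulated widely for the original bug (CVE-2014-6271) sets an environment variable that looks like a function definition followed by an extra command, then starts a new Bash. A minimal sketch, wrapped in a hypothetical helper called check_shellshock; it only covers this first variant, not the follow-up bugs.

```shell
#!/bin/bash
# Sketch only: check_shellshock is an illustrative helper name.
# A vulnerable Bash runs the trailing "echo vulnerable" while importing x;
# a patched Bash treats x as an ordinary variable and prints nothing.
check_shellshock() {
    if env x='() { :;}; echo vulnerable' bash -c 'true' 2>/dev/null | grep -q vulnerable; then
        echo "vulnerable"
    else
        echo "not vulnerable"
    fi
}

check_shellshock
```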

As far as I know, ordinary PC systems are largely not affected. Most PCs use Windows, which doesn’t use Bash and thus is not vulnerable to this bug. Even personal machines that use Linux or Mac OS X usually don’t enable remote services that attackers could use to exploit this bug.

The systems affected will be mainly servers. Although Windows rules personal computers, a very large portion of servers run Linux or some version of Unix. Most web servers, in fact, run on Unixes (with Apache or Nginx). So even though your own system may not be vulnerable, the Internet as a whole has a great problem.

On to the links. I may add new ones.

  1. Initial report from Akamai
  2. An overview of the bug from Troy Hunt (via Hacker News)
  3. Fedora Magazine explains how the vulnerability works
  4. Apple says most Mac OS X users won’t be affected, but you should update anyway when a patch is released.
  5. The initial patch doesn’t really eradicate the problem. And there may be more bugs lurking.
  6. The bug is quickly exploited
  7. Oracle products (other than obvious one like Solaris and Oracle Linux) are affected as well.
  8. Good explanation about several techniques that can be used to exploit this vulnerability. 

Doing scheduled blog posts

I have some blog drafts lying around. I plan to release them in a sequence. The problem is I haven’t finished them in the right order. It seems that I will have to complete them all, then stagger the posting.

It turns out that Blogger (where I save the drafts in question) has a feature called scheduled posting, something I haven’t explored before. Of course I have heard about it, but I didn’t really pay attention. WordPress and Tumblr also have this feature. I guess I’ll use it more often.

Of course, this post is scheduled :)

Tails: quick review

tails
Like all trendy, privacy-conscious, NSA-suspicious, self-respecting journalists, I am interested in Tails, an operating system that aims to “preserve privacy and anonymity”. But I didn’t have the inclination to try it until recently.

Tails uses Tor for anonymity, GPG for email encryption, and OTR for chat encryption. You can set them up in a standard Linux distribution (as I have done), but in Tails all of these are already prepared for you and ready to use. It is also designed to be run from a CD or USB drive, and thus won’t leave any trace on the computer.

As the screenshot shows, I run it on a virtual machine (a VirtualBox instance), which is not really recommended for typical usage. But I am only curious about how it looks. It seems Tails is based on Debian Squeeze, which is really old (it uses, for instance, GNOME 2). But the kernel is rather modern (Linux 3.12), which should keep it compatible with the latest hardware. The browser is also quite up to date (based on Firefox 24).

Because I have only used this OS for two days, I haven’t tried features like persistent storage. I may provide further updates, so stay tuned.

Update: It turns out that persistent storage requires two things: Tails Installer (which is only found in Tails images) and a USB drive with more than 4 GB of space. I failed at the last requirement, because I only have 2 GB flash drives. No luck yet.

Wikipedia and voter education

April 9 2014 will be election day in Indonesia. From quick observation I find there are still a lot of people who can’t make up their minds yet. Who to choose? Who are the suitable candidates for the local and national legislatures?

Many will just select the party, not the candidates themselves. This simplifies things: voters do not need to concern themselves with track records, programmes, etc. Just the party. Just the political tribe. This is not ideal (at least not to me): sometimes the parties do not rank the candidates by their capabilities, but by their connections to party leaders. Good, promising candidates are put in lower ranks. In this case it’s better for voters to choose the candidate directly. In the case of regional representatives (DPD) you can’t choose parties, so information about candidates is even more important.

The Election Commission (KPU) has actually put the information about the candidates on its website. But as usual it is poorly presented, and the information is sparse. Several sites and apps have sprung up to offer better presentation and visualisation, but the data are still mainly sourced from the KPU database.

Here I think Wikipedia can, and should, help. It is a crowdsourced platform (so everyone can contribute), well known, and also easy to search. Wikipedians can create entries for politicians and public office holders, and document their successes and flaws. Wikipedia biography articles usually reference public sources (news, government publications, etc), so they are very easy to check. It won’t really replace election apps and sites (such as those presented at pemiluapps.org), but it can be a complement, and perhaps an additional database, to such apps.

Unfortunately, Indonesian Wikipedia’s coverage of the country’s politicians is rather sparse, and existing articles are not always of good quality. Adding and editing politicians’ biography articles can be a theme for future Wikipedia editathons. It might be too late for choosing MP candidates, but it can still be done when the results are out and the elected candidates are known. Over time, Indonesian Wikipedia should have better coverage of Indonesian politicians and help voters educate themselves for the next election.
