
Thursday, December 01, 2011

The Notifier

Have you ever been in a situation where you were forced to check a web page again and again to see whether a particular piece of information had appeared? I have been in such a situation: I wanted to know whether my university results had been released. Checking the university site every time in a browser was an option, but I decided to automate the process. That's how the following script was born.

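A minimal sketch of what such a notifier could look like. The variable names URL, Lookfor and Displaymessage come from the post; the fetching and notification commands (wget and notify-send) and the URL are my assumptions, since the original script is not reproduced here.

```shell
#!/bin/sh
# Sketch of the notifier: fetch the page, look for the target text,
# and raise a desktop notification when it appears.
URL="http://www.example.com/results.html"  # placeholder address, not the real site
Lookfor="Results Published"                # text whose appearance we wait for
Displaymessage="The results are out!"      # notification text

# Download the page quietly to stdout and search it for the target string.
page=$(wget -q -O - "$URL")
if printf '%s' "$page" | grep -q "$Lookfor"; then
    notify-send "$Displaymessage"          # desktop pop-up when the text appears
fi
```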
After writing this script, I had to automate running it. This was achieved using crontab: I scheduled the script to execute every minute, so that when the information appears on the web page I get notified automatically.
Note: By changing the variables URL, Lookfor and Displaymessage, anyone can customize it for their own use.
Steps to set this up on a Linux system
Step 1: Save the script with executable permissions. This can be achieved with the command chmod 777 filename
Step 2: Type the command crontab -e. A file opens, to which you need to add the following: * * * * * absolute_path_to_the_script
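For context, the five asterisks in that crontab line are the minute, hour, day-of-month, month and day-of-week fields; an asterisk in every field means "run every minute". Assuming the script were saved as /home/user/notifier.sh (a hypothetical path), the entry would look like:

```
# min hour dom mon dow  command
  *    *    *   *   *   /home/user/notifier.sh
```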

Saturday, November 12, 2011

Calicut University result summary using Shell & Python scripts

 I wanted to create a summary of my class's university exam results, with details like the rank list, the number of students between different SGPA limits, and the subject-wise performance of students. I developed a shell script for the first two tasks and a Python script for the first and last. Both scripts involve crawling through the individual results of students. Here is the link

SHELL SCRIPT
 wget and pdftotext are the most important tools used in this shell script. wget is a tool for downloading web pages. The Calicut University website blocks user-agent programs (e.g. wget) from accessing it, so spoofing is required: we make the website believe that a browser is accessing it rather than the wget program. This type of spoofing is easily done with the command: wget -U browsername URL. That is how we download the individual result pages.
Tip: wget, when used with the -r switch, will download web pages recursively.
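As an illustration of the spoofed crawl, a loop like the following could fetch each student's result. The -U (user-agent) flag is a real wget option, but the URL pattern and register numbers below are invented placeholders, not the university's actual addresses.

```shell
# Pretend to be a browser so the site does not reject the request.
UA="Mozilla/5.0 (X11; Linux x86_64)"
BASE="http://example.com/result"   # placeholder, not the real site

# Fetch one result page per register number.
for regno in 101 102 103; do
    wget -U "$UA" -q -O "result_$regno.pdf" "$BASE?regno=$regno"
done
```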

Calicut University, for its part, publishes the individual results in PDF format. Since creating the summary requires text processing, the PDF files must first be converted to text format. That is where pdftotext comes in handy: as the name suggests, it converts PDF files to text files. After downloading and converting the individual results to text, the summary is just a matter of some text processing.
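Since the actual result layout is not shown in the post, here is a self-contained sketch of that text-processing step using an invented layout. In the real pipeline each PDF would first pass through pdftotext (e.g. `pdftotext result_101.pdf result_101.txt`); awk then counts students per SGPA band.

```shell
# Made-up result lines standing in for the converted PDFs.
cat > results.txt <<'EOF'
REG101 SGPA 8.4
REG102 SGPA 6.9
REG103 SGPA 7.5
EOF

# Count students falling in each SGPA band (field 3 is the SGPA).
above8=$(awk '$3 >= 8 {n++} END {print n+0}' results.txt)
band7=$(awk '$3 >= 7 && $3 < 8 {n++} END {print n+0}' results.txt)
echo "SGPA >= 8: $above8, 7 <= SGPA < 8: $band7"
```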

PYTHON SCRIPT
                       
In order to crawl through the individual results, I took the help of Python's urllib module. The challenge in this script, too, was converting the PDF files to text. Here the Python pdfminer module helped me: along with pdfminer, a program called pdf2txt.py gets installed, which can be used to convert a PDF file into a text file. As stated earlier, the remaining task is just text processing.
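The crawling side could be sketched as below. The URL pattern and helper names are invented for illustration, and urllib.request is the Python 3 form of the module (the post likely used Python 2's urllib); the pdfminer step would then run on each downloaded file via the bundled pdf2txt.py tool.

```python
import urllib.request  # Python 3 module path; the post's script may have used Python 2

BASE = "http://example.com/results?regno={}"  # placeholder pattern, not the real site


def result_url(regno):
    """Build the (hypothetical) URL for one student's result."""
    return BASE.format(regno)


def fetch_result(regno):
    """Download one student's result PDF and return its raw bytes."""
    req = urllib.request.Request(
        result_url(regno),
        headers={"User-Agent": "Mozilla/5.0"},  # same spoofing trick as wget -U
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


# Each downloaded PDF would then be converted on the command line with
# pdfminer's bundled tool:  pdf2txt.py result_101.pdf > result_101.txt
```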