Using Cookie Jar With urllib2

A while ago, we discussed how to scrape information from websites that don't offer information in a structured format like XML or JSON. We noted that urllib and lxml are indispensable tools in web scraping. While urllib enables us to connect to websites and retrieve information, lxml helps convert HTML, broken or not, to valid XML and parse it. In this post, I will demonstrate how to retrieve information from web pages that require a login session.
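
As a rough illustration of the idea, here is a minimal sketch of a cookie-aware opener. It is written for Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar. The login URL and the form field names ("username", "password") are hypothetical and would have to match the real login form.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_session_opener(cookie_jar=None):
    """Return (opener, jar). The opener stores cookies in the jar and resends them."""
    # An empty CookieJar is falsy, so compare against None explicitly.
    jar = cookie_jar if cookie_jar is not None else http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, login_url, username, password):
    """POST the credentials. The field names are assumptions, not a real site's form."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Passing data makes this a POST; any Set-Cookie headers land in the jar.
    return opener.open(login_url, data=form.encode("utf-8"))
```

After a successful login, later calls to opener.open() on the same opener send the session cookie back automatically, which is what lets protected pages be fetched.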


What is your definition of a "True PHP Developer?"

Jamie asks on LinkedIn.

The short answer

The question is wrong.

The long answer

A true PHP developer is a programmer who knows PHP. A false PHP developer is someone who doesn't know PHP. That's the closest correct answer I can think of.

I think Jamie wants to ask, "What's your definition of a good PHP developer?" There is no single correct answer to that question. All you can do is highlight some of the good things a PHP developer does.

Let's seize this opportunity to talk about the traits of a good PHP developer. Most of what applies to a good PHP programmer also applies to a good web developer and to a good programmer in general.


Event Report - Richard M Stallman Spoke At Reva Institute of Science and Management, Bangalore

The topic was Free Software Movement and GNU/Linux operating system.

It was a long drive to Reva Institute, 40 kilometers from home. I reached the venue in time thanks to the moderate traffic. The third floor was already full, so I had to go to the fourth floor to listen to the speech. The auditorium stage can be viewed from both the third and fourth floors. The floor had two elevated blocks, one above the other. There were no chairs on the fourth floor, and the floor was a bit dusty. Approximately five hundred people attended the event.

The talk went as you would expect. RMS started off by explaining the meaning of free software and the four freedoms. Then he talked about the history of the free software movement, the FSF, GNU, Linux, and Emacs. Even though I am quite familiar with the topics, it was interesting to hear them from the horse's mouth.


Web Scraping With lxml

More and more websites are offering APIs nowadays. Previously, we've talked about XML-RPC and REST. Even though web services are growing rapidly, there are still a lot of websites that offer information in an unstructured format, especially government websites. If you want to consume information from those websites, web scraping is your only choice.

What is web scraping?

Web scraping is a technique in which a program mimics a human browsing a website. To scrape a website from your programs, you need tools to:

  • Make HTTP requests to websites
  • Parse the HTTP response and extract content
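
The two steps above can be sketched roughly as follows. To keep the example runnable without extra packages, it uses the standard library's html.parser as a stand-in for lxml. The fetch() helper and the sample markup are illustrative assumptions, not code from the original post.

```python
import urllib.request
from html.parser import HTMLParser


def fetch(url):
    """Step 1: make the HTTP request and return the body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class LinkExtractor(HTMLParser):
    """Step 2: walk the markup and collect (href, link text) pairs."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up markup with unclosed <li> tags, as often found on real pages.
SAMPLE = '<ul><li><a href="/reports">Annual report</a><li><a href="/contact">Contact</a></ul>'
print(extract_links(SAMPLE))
```

With lxml the extraction step would typically be an XPath query over the parsed tree, but the overall shape (fetch, parse, extract) stays the same.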

Make Your Own Script Appender In Mako Templates

In a recently started Pylons project, I wanted to make an easy script appending facility in Mako templates.

The requirement:

  • base.mako contains the layout of the web page. Many templates inherit base.mako. Here's a snippet from base.mako
        <title>Some title</title>
  • my_page.mako inherits base.mako. From within my_page.mako we want to be able to append script tags in the head section of the web page.

Becoming Productive In Bash Using The Keyboard Shortcuts

Moving around

You can use the arrow keys on the keyboard to move around in the command line. Bash also provides convenient keyboard shortcuts to navigate effectively. Try them out and see for yourself.

To become a Bash power user, you have to familiarize yourself with the keyboard shortcuts. Once you do, you'll find yourself more productive.

CTRL+b move backward one character
CTRL+f move forward one character
ESC+b move one word backward
ESC+f move one word forward
CTRL+a move to beginning of line
CTRL+e move to end of line
CTRL+p move to previous line
CTRL+n move to next line
ESC+< move to first line of history list
ESC+> move to last line of history list

Moving between words with ESC+f and ESC+b is my favourite in this list. Jumping to the first and last lines of the history list is also useful.


Programmatically Create DateTextBox

How to create dijit.form.DateTextBox widget programmatically

There are two ways to create Dojo's widgets

  1. Declarative
  2. Programmatic

Which approach to take is often a matter of project preference and personal opinion. Many developers are against mixing markup and JavaScript.

Programmatically creating widgets has its advantages. For instance, you may want to create a date picker when a button is clicked.

In a previous article, we discussed how to add a Dojo date picker declaratively. We also showed how to add a Dojo date picker without writing a single line of JavaScript using the Zend Framework.

Let's create a dijit.form.DateTextBox widget programmatically step by step.


PHP 5 e-commerce Development - Book Review

I was contacted by Packt to review the book PHP 5 e-commerce Development by Michael Peacock.

The book serves as an introductory tutorial on developing an e-commerce website using PHP. It covers 15 chapters in 310 pages.

You can grab a sample chapter from the publisher's website.

The publisher's website has a detailed table of contents.

Who should read the book?

You should read the book if you are learning PHP and are new to e-commerce. Beginners trying to use out-of-the-box software like Drupal CMS or OSCommerce tend to get frustrated sooner or later. These content management systems have their own ways of doing things. If you are new to PHP, complex software like Drupal can be intimidating until you thoroughly understand its inner workings. Developers often choose to roll their own software to avoid the steep learning curve of existing open source software. If you have felt the same way, this book is worth a try.


A Bit Of XML, RSS And CURL In 7 Lines Of PHP And A Useful Program

Today, I was looking for a quick way to get the current weather information on my computer. There are many websites that offer the information, but I was looking for a program I could permanently install on my computer and launch whenever I want to look up the weather. Oddly, I didn't find any satisfying program. At the same time, I was also watching a video about network programming. That inspired me to quickly write a program in PHP to print the current weather information for where I live.

I started to look for a web service that offers weather information for free. Did I tell you is a useful website to find web services? If you have subscribed to the Tech Chorus blog, you know we've been talking about REST, XML-RPC and web services in general for a while. I ended up on the Yahoo! Weather API web page.

I wrote a program that prints the weather information in 7 lines of PHP code. I have published this program in the Code Album GitHub repository. You can grab it and use it.

If you want to know how to write similar programs, read on. If you know a bit of PHP and have heard about XML and RSS before, you can understand the program and start building upon it.
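
To give a feel for the approach, here is a small Python sketch of the same idea: read an RSS document and print the first item. It is not the author's 7-line PHP program. The feed text below is made-up sample data, and a real program would download the feed over HTTP (with cURL in PHP, or urllib.request in Python) instead of using a string.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; a real weather feed would be fetched from the service.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example weather feed</title>
    <item>
      <title>Conditions for Example City</title>
      <description>Partly cloudy, 27 C</description>
    </item>
  </channel>
</rss>"""


def first_item(feed_xml):
    """Return (title, description) of the first <item>, or None if there is none."""
    root = ET.fromstring(feed_xml)
    item = root.find("./channel/item")
    if item is None:
        return None
    return item.findtext("title"), item.findtext("description")


result = first_item(SAMPLE_FEED)
if result is not None:
    title, description = result
    print(title)
    print(description)
```

RSS is plain XML, so once the feed is downloaded, an XML parser plus a couple of element lookups is all that is needed to reach the weather text.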


Create RESTful Applications Using The Zend Framework - Part III : Managing API Key

In the first two posts of this series, we discussed how to route REST requests to controllers and return HTTP response codes. In this article, I will talk about managing API keys.

It is convenient to have clients send the API key in an HTTP header. We can quickly check the HTTP request header and decide whether to allow or deny the request.
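
The header check can be sketched in Python as a small WSGI middleware, which plays a role loosely similar to a front controller plugin: it sees every request before the application does. This is only an analogue, not Zend Framework code. The header name X-API-Key, the 401 response, and the key values are all assumptions made for illustration.

```python
import hmac


def is_valid_key(supplied, valid_keys):
    """Compare the supplied key against each known key in constant time."""
    supplied_bytes = supplied.encode("utf-8")
    return any(
        hmac.compare_digest(supplied_bytes, key.encode("utf-8"))
        for key in valid_keys
    )


class ApiKeyMiddleware:
    """Reject requests that lack a valid key in the (assumed) X-API-Key header."""

    def __init__(self, app, valid_keys):
        self.app = app
        self.valid_keys = set(valid_keys)

    def __call__(self, environ, start_response):
        # WSGI exposes the X-API-Key request header as HTTP_X_API_KEY.
        supplied = environ.get("HTTP_X_API_KEY", "")
        if not is_valid_key(supplied, self.valid_keys):
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"Invalid or missing API key\n"]
        # The key is valid: hand the request on to the wrapped application.
        return self.app(environ, start_response)
```

Because the check happens before any controller code runs, a denied request never reaches the application logic.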

As a prerequisite, you should be familiar with writing front controller plugins. Let's write a front controller plugin that does the following:
