Blogs

XML For PHP Developers

On 30 October 2010, the Bangalore PHP User Group met at Microsoft. I gave a talk, XML For PHP Developers, at the event.

I'm sharing the slides in this entry.

It was an introductory talk on XML for PHP developers. There are hundreds of technologies built on top of XML. We have all heard about RSS, Atom, XML-RPC, SOAP, etc. The goal of the talk was to get PHP developers to start using XML. In the talk, I presented three recipes:


Using Cookie Jar With urllib2

A while ago, we discussed how to scrape information from websites that don't offer information in a structured format like XML or JSON. We noted that urllib and lxml are indispensable tools in web scraping. While urllib enables us to connect to websites and retrieve information, lxml helps convert HTML, broken or not, to valid XML and parse it. In this post, I will demonstrate how to retrieve information from web pages that require a login session.

I have created a sample website for this task - http://toscrape.techchorus.net

The website has a page that requires a login session - http://toscrape.techchorus.net/only_authenticated.php

To log in to the sample website, use the credentials:
username: admin
password: password

If you visit http://toscrape.techchorus.net/ you will notice that the server sends the response with headers:

Date: Tue, 19 Oct 2010 17:33:43 GMT
Server: Apache mod_fcgid/2.3.5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By: PHP/5.2.14
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=c84456d65e5b9da95b09abd4092f860b; path=/
Location: /login.php
Content-Length: 0
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html

In order to maintain the session, you have to send the cookie PHPSESSID with the value c84456d65e5b9da95b09abd4092f860b in all subsequent requests. Of course, the value varies for each user. If the server resets the value of the cookie in a subsequent response, you have to send the updated cookie value in further requests.
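
To make the mechanics concrete, here is a minimal sketch of what a client would have to do by hand: read the Set-Cookie value from the response and turn it into the Cookie header for the next request. It assumes Python 3 and uses only the standard library module http.cookies; the helper name cookie_header_from_set_cookie is made up for this illustration.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Build a Cookie request header value from a Set-Cookie response header value.

    Hypothetical helper for illustration only. Attributes such as path
    are dropped, because only name=value pairs are sent back to the server.
    """
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    return "; ".join(
        "{}={}".format(name, morsel.value) for name, morsel in parsed.items()
    )


# The Set-Cookie value from the response headers shown above.
SET_COOKIE = "PHPSESSID=c84456d65e5b9da95b09abd4092f860b; path=/"

if __name__ == "__main__":
    print("Cookie:", cookie_header_from_set_cookie(SET_COOKIE))
```

Doing this manually for every response quickly becomes tedious, which is the problem a cookie jar solves.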

The Python standard library offers the module cookielib to manage cookies on the client side. We can use it as a cookie jar. In essence, cookielib offers a container to hold cookies, and we plug this container into urllib2.

Let's start writing the code.
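
A minimal sketch of the approach, not a definitive implementation. It assumes Python 3, where cookielib and urllib2 are available as http.cookiejar and urllib.request. The form field names username and password and the POST target /login.php are assumptions about the sample site, and the function names are hypothetical.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "http://toscrape.techchorus.net"


def make_session_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, username, password):
    """POST the credentials; the field names are assumed, not confirmed."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    return opener.open(BASE_URL + "/login.php", form.encode("ascii"))


def fetch_protected(opener):
    """Fetch the page that requires a login session, reusing the jar's cookies."""
    with opener.open(BASE_URL + "/only_authenticated.php") as response:
        return response.read()


if __name__ == "__main__":
    opener, jar = make_session_opener()
    login(opener, "admin", "password")
    print([cookie.name for cookie in jar])  # cookie names the server set
    print(fetch_protected(opener)[:200])
```

Because every request goes through the same opener, the jar picks up PHPSESSID from the first response and sends it back automatically, including any updated value the server sets later.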


What is your definition of a "True PHP Developer?"

Jamie asks on LinkedIn.

The short answer

The question is wrong.

The long answer

A true PHP developer is a programmer who knows PHP. A false PHP developer is someone who doesn't know PHP. That's the closest correct answer I can think of.

I think Jamie wants to ask, "What's your definition of a good PHP developer?" There is no single correct answer to that question. All you can do is highlight some of the good things a PHP developer does.

Let's seize this opportunity to talk about the traits of a good PHP developer. Most of what applies to a good PHP programmer also applies to a good web developer and to a good programmer in general.


Event Report - Richard M Stallman Spoke At Reva Institute of Science and Management, Bangalore

The topic was Free Software Movement and GNU/Linux operating system.

It was a long drive to Reva Institute, 40 kilometers from home. I reached the venue in time thanks to the moderate traffic. The third floor was already full, so I had to go to the fourth floor to listen to the speech. The auditorium stage can be viewed from both the third and fourth floors. The floor had two elevated blocks, one above the other. There were no chairs on the fourth floor, and the floor was a bit dusty. Approximately five hundred people attended the event.

The talk was what you would expect. RMS started off by explaining the meaning of free software and the four freedoms. Then he talked about the history of the free software movement, FSF, GNU, Linux and Emacs. Even though I am quite familiar with the topics, it was interesting to hear them from the horse's mouth.


Web Scraping With lxml

More and more websites are offering APIs nowadays. Previously, we've talked about XML-RPC and REST. Even though web services are growing rapidly, there are still a lot of websites out there that offer information only in an unstructured format, especially government websites. If you want to consume information from those websites, web scraping is your only choice.

What is web scraping?

Web scraping is a technique in which a program mimics a human browsing a website. To scrape a website in your programs, you need tools to:

  • Make HTTP requests to websites
  • Parse the HTTP response and extract content
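
The second step can be sketched as follows. lxml is the tool this post is about; to stay runnable without extra installs, this sketch swaps in the standard library's html.parser instead (assuming Python 3), which also tolerates unclosed tags. The class and function names and the sample markup are invented for the illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Invented sample with an unclosed <p>, standing in for an HTTP response body.
SAMPLE = "<html><body><p>Reports<a href='/a.html'>A</a><a href=\"/b.html\">B</a></body>"

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

In a real scraper, the string passed to extract_links would be the decoded body of the HTTP response from the first step.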

Make Your Own Script Appender In Mako Templates

In a recently started Pylons project, I wanted to make an easy script appending facility in Mako templates.

The requirement:

  • base.mako contains the layout of the web page. Many templates inherit base.mako. Here's a snippet from base.mako
    <html>
    <head>
        <title>Some title</title>
        <script>...</script>
        <script>...</script>
    </head>
     
    </%def>
  • my_page.mako inherits base.mako. From within my_page.mako we want to be able to append script tags in the head section of the web page.

Becoming Productive In Bash Using The Keyboard Shortcuts

Moving around

You can use the arrow keys on the keyboard to move around in the command line. Bash also provides convenient keyboard shortcuts to navigate effectively. Try them out and see for yourself.

To become a Bash pro user, you have to get familiar with the keyboard shortcuts. Once you do, you'll find yourself more productive.

CTRL+b move backward one character
CTRL+f move forward one character
ESC+b move one word backward
ESC+f move one word forward
CTRL+a move to beginning of line
CTRL+e move to end of line
CTRL+p move to previous line
CTRL+n move to next line
ESC+< move to first line of history list
ESC+> move to last line of history list

Moving between words using ESC+f and ESC+b is my favourite in this list. Jumping to the first and last lines of the history list is also useful.


PHP 5 e-commerce Development - Book Review

I was contacted by Packt to review the book PHP 5 e-commerce Development by Michael Peacock.

The book serves as an introductory tutorial on developing an e-commerce website using PHP. It has 15 chapters spread over 310 pages.

You can grab a sample chapter from the publisher's website.

The publisher's website has a detailed table of contents.

Who should read the book?

You should read the book if you are learning PHP and are new to e-commerce. Beginners trying to use out-of-the-box software like Drupal CMS or OSCommerce tend to get frustrated sooner or later. These systems have their own ways of doing things. Being new to PHP and to complex software like Drupal can be intimidating until you thoroughly understand the inner workings of the software. Developers often choose to roll their own software to avoid the steep learning curve of existing open source software. If you have felt the same way, this book is worth a try.


A Bit Of XML, RSS And CURL In 7 Lines Of PHP And A Useful Program

Today, I was looking for a quick way to get the current weather information on my computer. There are so many websites out there that offer the information. But I was looking for a program I could install permanently on my computer and launch whenever I want to look up the weather. Oddly, I didn't find any satisfying program. At the same time, I was also watching a video about network programming. That inspired me to quickly write a program in PHP to print the current weather information for where I live.

I started to look for a web service that offers weather information for free. Did I tell you programmableweb.com is a useful website to find web services? If you have subscribed to the Tech Chorus blog, you know we've been talking about REST, XML-RPC and web services in general for a while. I ended up on the Yahoo! Weather API web page.

I wrote a program to print the weather information in 7 lines of PHP code. I have published this program in the Code Album GitHub repository. You can grab it and use it.

If you want to know how to write similar programs, read on. If you know a bit of PHP and have heard about XML and RSS before, you can understand the program and start building upon it.
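
For readers who want a feel for the RSS part before diving in, here is a rough sketch of the same idea in Python rather than PHP, using only the standard library (assuming Python 3). It is not the published 7-line program. The feed below is an invented sample, not real Yahoo! Weather output, and the function name first_item is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real program would download this text over HTTP.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample weather feed</title>
    <item>
      <title>Conditions for Bangalore</title>
      <description>Partly cloudy, 24 C</description>
    </item>
  </channel>
</rss>"""


def first_item(rss_text):
    """Return (title, description) of the first item in an RSS 2.0 feed, or None."""
    root = ET.fromstring(rss_text)
    item = root.find("./channel/item")
    if item is None:
        return None
    return item.findtext("title"), item.findtext("description")


if __name__ == "__main__":
    title, description = first_item(SAMPLE_RSS)
    print(title)
    print(description)
```

An RSS feed is plain XML, so printing the weather comes down to fetching the feed and reading a couple of elements from its first item.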


Create RESTful Applications Using The Zend Framework - Tutorial Series

