If you have to comment out several lines, do you manually insert the comment character on every line? Stop. Vim is a really good editor and has a nice feature to accomplish this quickly. Here are the steps:
A while ago, we discussed how to scrape information from websites that don't offer information in a structured format like XML or JSON. We noted that urllib and lxml are indispensable tools in web scraping. While urllib enables us to connect to websites and retrieve information, lxml helps convert HTML, broken or not, to valid XML and parse it. In this post, I will demonstrate how to retrieve information from web pages that require a login session.
I have created a sample website for this task - http://toscrape.techchorus.net
The website has a page that requires a login session - http://toscrape.techchorus.net/only_authenticated.php
To log in to the sample website, use the credentials:
username: admin
password: password
If you visit http://toscrape.techchorus.net/ you will notice that the server sends the response with headers:
Date: Tue, 19 Oct 2010 17:33:43 GMT
Server: Apache mod_fcgid/2.3.5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By: PHP/5.2.14
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=c84456d65e5b9da95b09abd4092f860b; path=/
Location: /login.php
Content-Length: 0
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html
In order to maintain the session, you have to send the cookie PHPSESSID with the value c84456d65e5b9da95b09abd4092f860b in all subsequent requests. Of course, the value varies for each user. If the server resets the value of the cookie in a subsequent response, you have to send the updated cookie value in further requests.
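To make the cookie round trip concrete, here is a small sketch in Python 3 that takes the Set-Cookie value from the response above and derives the Cookie header to send back. The helper name cookie_header_from_set_cookie is made up for illustration; only the standard http.cookies module is used.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Turn a Set-Cookie header value into the value of a Cookie request header."""
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    # Only name=value pairs go back to the server; attributes such as
    # path stay on the client side.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in parsed.items())


header = cookie_header_from_set_cookie(
    "PHPSESSID=c84456d65e5b9da95b09abd4092f860b; path=/"
)
print(header)
```

Doing this by hand for every request and response quickly becomes tedious, which is why a cookie container is the more practical choice.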
The Python standard library offers the cookielib module to manage cookies on the client side. We can use it as a cookie jar: in essence, cookielib provides a container that holds cookies, and we plug this container into urllib2.
Let's start writing the code.
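A minimal sketch of where that code could start, written for Python 3, where cookielib lives on as http.cookiejar and urllib2 as urllib.request. The form field names (username, password) and the idea of posting to /login.php are assumptions based on the redirect shown above, not something the site confirms, and the helper names are invented for illustration.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE_URL = "http://toscrape.techchorus.net"


def build_session():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_login_form(username, password):
    """Encode the login form as POST data."""
    # Assumption: the form fields are named "username" and "password".
    fields = {"username": username, "password": password}
    return urllib.parse.urlencode(fields).encode("ascii")


def fetch_protected_page(opener):
    """Log in, then request the page that needs a session (uses the network)."""
    # Assumption: the login form posts to /login.php.
    opener.open(BASE_URL + "/login.php", data=encode_login_form("admin", "password"))
    with opener.open(BASE_URL + "/only_authenticated.php") as response:
        return response.read()
```

Calling build_session() once and reusing the returned opener for every request is what keeps PHPSESSID flowing back to the server; the jar also picks up any new value the server sets later.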
More and more websites are offering APIs nowadays. Previously, we've talked about XML-RPC and REST. Even though web services are growing rapidly, a lot of websites still offer information only in an unstructured format, government websites especially. If you want to consume information from those websites, web scraping is your only choice.
Web scraping is a technique used in programs that mimic a human browsing the website. In order to scrape a website in your programs you need tools to
In a previous article we discussed how to install, remove, update and search for software packages using yum. In this post we discuss how to install only security updates using yum.
The scenario:
Problem: You are trying to use the command svn propedit svn:externals and you are receiving the error:
svn: None of the environment variables SVN_EDITOR, VISUAL or EDITOR are set, and no 'editor-cmd' run-time configuration option was found
Solution: Set vim as your SVN_EDITOR
Command:
export SVN_EDITOR=vim
To set this environment variable permanently, add the following line to your ~/.bash_profile file:
export SVN_EDITOR=vim
Did it solve your problem?
<?php
Zend_Controller_Action_HelperBroker::getStaticHelper('helpername');
?>
For example, you can access the redirector helper from within your front controller plugin:
<?php
$redirector = Zend_Controller_Action_HelperBroker::getStaticHelper('redirector');
?>
Somebody recently asked me how to print the PHP version from within a PHP script. The answer is very simple and requires only two words. Here is the script:
<?php
echo PHP_VERSION;
?>
PHP_VERSION is a predefined constant. It contains the value of the PHP version.
A sample output is as follows.
[sudheer@localhost cli]$ php php_version.php
5.2.6
[sudheer@localhost cli]$
The request object contains the name of the module, controller, action and the request parameters. Sometimes, you might want to access the request object outside the controller or controller plugin.
For example, a user on #zftalk just asked:
"how can I access request object within form's method?"
The front controller instance is a singleton. This means we can get the instance of the front controller from any part of our application using the static method getInstance().
Is scrolling vertically on web pages in your Firefox horribly slow?
I encountered this issue recently on Fedora 10. Initially, I suspected the binary NVIDIA driver. But I was wrong. I found a simple solution.
Disable smooth scrolling in the Firefox preferences.
Voilà.