Chapter Name | Highlights |
Introduction |
In the introduction, you'll learn how I started writing webbots and spiders in 1996, what to expect from the book, the tools you'll need (all open source), and the coding standards used. |
Part I: Fundamental Concepts and Techniques |
#1 |
What's in It for You? |
Describes how webbots can uncover the Internet's true potential |
#2 |
Ideas for Webbot Projects |
Where do ideas for webbots come from? Read a sample chapter at the No Starch Press website. |
#3 |
Downloading Web Pages |
Explores techniques for downloading web pages with PHP built-in functions and PHP/CURL |
#4 |
Parsing Techniques |
Teaches how to effectively parse data from web pages. |
#5 |
Automating Form Submission |
Explains how to write webbots that automatically fill out forms and upload data to remote web servers |
#6 |
Managing Large Amounts of Data |
Describes how to organize and store large amounts of data with compression, tag removal and thumbnailing |
Part II: Projects |
#7 |
Price-Monitoring Webbots |
Shows how to write webbots that monitor prices at online stores |
#8 |
Image-Capturing Webbots |
Describes a project that downloads all the images from a web page |
#9 |
Link-Verification Webbots |
Explores a project that verifies all the links on a web page |
#10 |
Anonymous Browsing Webbots |
Introduces a conceptual project for using webbots to create an anonymous browsing environment |
#11 |
Search-Ranking Webbots |
Explores a webbot that determines the search engine ranking of a web page |
#12 |
Aggregation Webbots |
Explains how to write webbots that combine information from multiple resources, including RSS feeds |
#13 |
FTP Webbots |
Explains how webbots can use FTP as an online resource |
#14 |
NNTP News Webbots |
Explains what NNTP newsgroups are and how webbots access them |
#15 |
Webbots That Read Email |
Describes methods webbots can use to read email from POP3 mail servers |
#16 |
Webbots That Send Email |
Explores methods webbots can use to send email through SMTP mail servers |
#17 |
Converting a Website into a Function |
Identifies ways to convert an online service into a PHP function your webbots can call |
Part III: Advanced Technical Considerations |
#18 |
Spiders |
A study of spider theory, with a simple spider project |
#19 |
Procurement Webbots and Snipers |
Explores how webbots automatically buy things from online stores and how snipers bid on online auctions. |
#20 |
Webbots and Cryptography |
Learn how to communicate with websites that use encryption. |
#21 |
Authentication |
Discover various authentication methods and how webbots can automatically authenticate with various websites. |
#22 |
Advanced Cookie Management |
Master reading and writing cookies with webbots. |
#23 |
Scheduling Webbots and Spiders |
Learn how to make webbots and spiders launch and run automatically. |
Part IV: Larger Considerations |
#24 |
Designing Stealthy Webbots and Spiders |
Learn when and why it's important for your webbots to run without detection, and how to make them stealthy. |
#25 |
Writing Fault-Tolerant Webbots |
Discover how to write webbots and parsing routines that are "less affected" by changes to the web pages you target. |
#26 |
Designing Webbot-Friendly Websites |
Master search engine optimization as well as methods for communicating data with websites, including lightweight interfaces and SOAP |
#27 |
Killing Spiders |
Gain an understanding of techniques web developers use to discourage the use of automated browsing agents. |
#28 |
Keeping Webbots out of Trouble |
Uncover the dangers of writing disreputable webbots and spiders |
Appendixes |
A |
PHP/CURL Reference |
A handy reference for using PHP/CURL |
B |
Status Codes |
A list of HTTP and NNTP status codes |
C |
SMS Email Addresses |
Addresses and tips for sending text messages through email