[Big Data Application] HTTP
HTTP and Sessions
- HTTP is a stateless protocol (often described as connectionless)
- That is, once the server replies to a request, it forgets all about the request; in the original protocol it also closes the connection with the client, although HTTP/1.1 may reuse one connection for several requests
- In contrast, Unix logins and JDBC/ODBC connections stay connected until the client disconnects, retaining user authentication and other information
- Motivation: reduces load on the server, since operating systems place tight limits on the number of open connections on a machine
- Information services need session information
- E.g. user authentication should be done only once per session
- Solution: use a cookie
Sessions and Cookies
- A cookie is a small piece of text containing identifying information
- Sent by server to browser on first interaction
- Sent by the browser, on further interactions, to the server that created the cookie (this exchange is part of the HTTP protocol)
- Server saves information about the cookies it issued, and can use it when serving a request, e.g. authentication information and user preferences
- Cookies can be stored permanently or for a limited time
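The cookie exchange above can be sketched with Python's standard `http.cookies` module. This is an illustrative sketch, not a prescribed implementation: the cookie name `session_id` and the helper names `make_set_cookie` and `read_session_id` are assumptions made for the example.

```python
from http.cookies import SimpleCookie
from typing import Optional


def make_set_cookie(session_id: str, max_age: Optional[int] = None) -> str:
    """Build the value of a Set-Cookie header that carries a session identifier."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id  # hypothetical cookie name
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True
    if max_age is not None:
        # Limited lifetime in seconds; without it the browser keeps the
        # cookie only until the browsing session ends.
        cookie["session_id"]["max-age"] = max_age
    return cookie["session_id"].OutputString()


def read_session_id(cookie_header: str) -> Optional[str]:
    """Extract the session identifier from a Cookie request header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None
```

The server would send the first value in a `Set-Cookie` response header, and on later requests pass the browser's `Cookie` header to `read_session_id` to find out which session the request belongs to.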
Servlets
- The Java Servlet specification defines an API for communication between the Web server and the application program
- E.g. methods to get parameter values and to send HTML text back to client
- Application program (also called a servlet) is loaded into the Web server, giving a two-tier model
- Each request spawns a new thread in the Web server
- Servlet API provides a getSession() method
- Sets a cookie on first interaction with browser, and uses it to identify session on further interactions
- Provides methods to store and look-up per-session information
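The idea behind `getSession()` can be illustrated outside Java. The following is a Python analogue under stated assumptions, not the Servlet API itself: the function `get_session`, its return shape, and the in-memory table `_sessions` are hypothetical names chosen for the sketch.

```python
import secrets
from typing import Dict, Optional, Tuple

# In-memory session table: session id -> per-session attributes.
# A real server would also expire old entries; this sketch does not.
_sessions: Dict[str, Dict[str, object]] = {}


def get_session(cookie_session_id: Optional[str]) -> Tuple[str, Dict[str, object], bool]:
    """Return (session_id, attributes, is_new) for the current request.

    cookie_session_id is the id the browser sent back in its cookie,
    or None on the first interaction.
    """
    if cookie_session_id is not None and cookie_session_id in _sessions:
        # Known session: reuse the stored attributes.
        return cookie_session_id, _sessions[cookie_session_id], False
    # First interaction, or an unknown id: issue a fresh, unguessable id.
    session_id = secrets.token_urlsafe(16)
    _sessions[session_id] = {}
    return session_id, _sessions[session_id], True
```

When `is_new` is true the server would set a cookie carrying `session_id`; on later requests the returned dictionary plays the role of the per-session store, e.g. `attrs["user"] = "alice"` after a successful login.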
Server-Side Scripting
- Server-side scripting simplifies the task of connecting a database to the Web
- Define an HTML document with embedded executable code/SQL queries.
- Input values from HTML forms can be used directly in the embedded code/SQL queries.
- When the document is requested, the Web server executes the embedded code/SQL queries to generate the actual HTML document.
- Numerous server-side scripting languages
- JSP, server-side JavaScript, ColdFusion Markup Language (CFML), PHP, JScript
- General-purpose scripting languages: VBScript, Perl, Python
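The mechanism described above (an HTML document with embedded queries that the server executes on request) can be sketched in a few lines of Python with `sqlite3`. The `<?sql ... ?>` tag syntax, the `render` function, and the `student` table are all invented for this illustration and do not correspond to any of the languages listed.

```python
import html
import re
import sqlite3

# Hypothetical tag syntax for an embedded query: <?sql SELECT ... ?>
SQL_TAG = re.compile(r"<\?sql(.*?)\?>", re.DOTALL)


def render(template: str, conn: sqlite3.Connection, form: dict) -> str:
    """Replace each embedded query with HTML list items built from its result rows."""
    def run_query(match):
        query = match.group(1).strip()
        # Form values are bound to named placeholders (:name) rather than
        # pasted into the SQL text, which avoids SQL injection.
        rows = conn.execute(query, form).fetchall()
        return "".join(
            "<li>" + " ".join(html.escape(str(col)) for col in row) + "</li>"
            for row in rows
        )
    return SQL_TAG.sub(run_query, template)


def demo_connection() -> sqlite3.Connection:
    """Create a small in-memory database used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (name TEXT, dept TEXT)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?)",
        [("Ann", "CS"), ("Bob", "Math"), ("Cy", "CS")],
    )
    return conn


PAGE = "<ul><?sql SELECT name FROM student WHERE dept = :dept ORDER BY name ?></ul>"
```

Calling `render(PAGE, demo_connection(), {"dept": "CS"})` produces the final HTML with the query replaced by its rows. Binding form input as parameters, and escaping the output, is a deliberate choice in this sketch; substituting raw form text into the query would work but is unsafe.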
Improving Web Server Performance
- Performance is an issue for popular Web sites
- May be accessed by millions of users every day, thousands of requests per second at peak time
- Caching techniques used to reduce cost of serving pages by exploiting commonalities between requests
- At the server site:
  - Caching of JDBC connections between servlet requests
  - Caching results of database queries (cached results must be updated if the underlying database changes)
  - Caching of generated HTML
- At the client's network: caching of pages by a Web proxy
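Caching of query results, together with the requirement that cached results be refreshed when the database changes, can be sketched as follows. This is a minimal illustration with assumptions: the class `CachedQueries` is hypothetical, and it uses the coarsest possible policy (drop the whole cache on any update), where a production system would invalidate more selectively.

```python
import sqlite3


class CachedQueries:
    """Wrap a connection and remember query results until the data changes."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.cache = {}    # (sql, params) -> list of result rows
        self.db_hits = 0   # number of queries actually sent to the database

    def query(self, sql: str, params: tuple = ()):
        key = (sql, params)
        if key not in self.cache:
            # Cache miss: run the query once and keep the rows.
            self.db_hits += 1
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]

    def update(self, sql: str, params: tuple = ()):
        self.conn.execute(sql, params)
        self.conn.commit()
        # Coarse invalidation: any update may affect any cached result.
        self.cache.clear()
```

Repeating the same query is served from memory and leaves `db_hits` unchanged, while any call to `update` forces the next query to go back to the database.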