What is Squid? A caching proxy that supports transparent proxying

Presentation transcript:

2 What is Squid?
- A caching proxy for
  - HTTP, HTTPS (tunnel only)
  - FTP
  - Gopher
  - WAIS (requires additional software)
  - WHOIS (Squid version 2 only)
- Supports transparent proxying
- Supports proxy hierarchies (ICP protocol)

3 Proxy Servers
- A proxy server serves at least two functions
  - it offers an extended cache to the local users, so that multiple users who access the same pages save time and bandwidth
  - it offers control over what material can be brought into the organization’s network and thus on to the clients
    - for instance, it can filter material for viruses
    - it can also filter material to disallow access to pornography, etc.
- Other functions that it can serve include
  - an authentication server
  - performing SSL operations like encryption and decryption
  - collecting statistics on web traffic and usage
- Additionally, the proxy server can offer an added degree of anonymity, in that it is the proxy server that places requests to remote hosts, not an individual’s computer
  - thus, the IP address sent to servers is that of the proxy server, not of the client

4 What is a proxy?
- Firewall device; internal users communicate with the proxy, which in turn talks to the big bad Internet
  - Gates private address space (RFC 1918) into publicly routable address space
- Allows one to implement policy
  - Restrict who can access the Internet
  - Restrict what sites users can access
  - Provides detailed logs of user activity
- What is a caching proxy?
  - Stores a local copy of objects fetched
  - Subsequent accesses by other users in the organization are served from the local cache, rather than the origin server
  - Reduces network bandwidth usage
  - Users experience faster web access
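The caching behaviour described above can be sketched in a few lines of JavaScript. This is only an in-memory model, not how Squid is implemented: `makeCachingProxy` and `fetchFromOrigin` are hypothetical names, and a real proxy would also handle expiry, validation and disk storage.

```javascript
// Minimal in-memory model of a caching proxy (illustrative only).
// fetchFromOrigin is a hypothetical stand-in for a real HTTP fetch.
function makeCachingProxy(fetchFromOrigin) {
  const cache = new Map(); // URL -> stored object
  const stats = { hits: 0, misses: 0 };

  function get(url) {
    if (cache.has(url)) {
      stats.hits += 1; // served from the local copy, no upstream traffic
      return cache.get(url);
    }
    stats.misses += 1; // first request has to go to the origin server
    const body = fetchFromOrigin(url);
    cache.set(url, body); // keep a local copy for later users
    return body;
  }

  return { get, stats };
}

// Usage: count how often the origin server is actually contacted.
let originRequests = 0;
const proxy = makeCachingProxy((url) => {
  originRequests += 1;
  return "content of " + url;
});

proxy.get("http://www.example.com/"); // first user: cache miss
proxy.get("http://www.example.com/"); // second user: cache hit
console.log(proxy.stats, originRequests);
```

Two users requested the same page, but the origin server was contacted only once, which is where the bandwidth and latency savings come from.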

5 Transparent proxying
- Router forwards all traffic to port 80 to the proxy machine using a route policy
- Pros
  - Requires no explicit proxy configuration in the user’s browser
- Cons
  - Route policies put excessive CPU load on routers on many (Cisco) platforms
  - Kernel hacks to support it on the proxy machine are still unstable
  - Often leads to mysterious page retrieval failures
  - Only proxies HTTP traffic on port 80; not FTP or HTTP on other ports
  - No redundancy in case of failure of the proxy
- Recommendation: Don’t use it!
  - Create a proxy auto-configuration file and instruct users to point at it
  - If you want to force users to use your proxy, either
    - Block all traffic to port 80
    - Use a route policy to redirect port 80 traffic to an origin web server and return a page explaining how to configure the various web browsers to access the proxy

6 squid.conf runtime settings
- Default squid.conf file is heavily commented! Read it!
- Must set
  - cache_dir (one per disk)
  - cache_peer (one per peer) if participating in a hierarchy
  - cache_mem (8-16M preferred, even for large caches)
  - acl rules (default rules mostly work, but must reflect your address space)

7 squid.conf ACL example
acl manager proto cache_object
acl localhost src /32
acl managerhost src /32
acl managerhost src /32
acl managerhost src /32
acl cawtech src /24
acl cawtech-internal src /16
acl all src /
acl SSL_ports port
acl gopher_ports port 70
acl wais_ports port 210
acl whois_ports port 43
acl www_ports port 80 81
acl ftp_ports port 21
acl Safe_ports port
acl CONNECT method CONNECT
acl FTP proto FTP
acl HTTP proto HTTP
acl WAIS proto WAIS
acl GOPHER proto GOPHER
acl WHOIS proto WHOIS
http_access deny manager !localhost !managerhost
http_access deny CONNECT !SSL_ports
http_access deny HTTP !www_ports !Safe_ports
http_access deny FTP !ftp_ports !Safe_ports
http_access deny GOPHER !gopher_ports !Safe_ports
http_access deny WAIS !wais_ports !Safe_ports
http_access deny WHOIS !whois_ports !Safe_ports
http_access allow localhost
http_access allow cawtech
http_access allow cawtech-internal
http_access deny all
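Several of the http_access lines above list more than one ACL. In Squid, all ACLs on one http_access line must match for the line to apply (a logical AND), and a leading "!" negates an ACL. The JavaScript below is a small sketch of that logic only. `ruleMatches` and the `facts` objects are hypothetical: they assume it is already known which named ACLs a request matches.

```javascript
// One http_access line applies only if ALL of its ACL terms match (logical AND).
// A leading "!" negates a term. `facts` records which named ACLs the request matches.
function ruleMatches(terms, facts) {
  return terms.every((term) => {
    const negated = term.startsWith("!");
    const name = negated ? term.slice(1) : term;
    const matched = Boolean(facts[name]);
    return negated ? !matched : matched;
  });
}

// Terms of "http_access deny manager !localhost !managerhost"
const denyManager = ["manager", "!localhost", "!managerhost"];

// Hypothetical requests for the cache manager from two different sources.
const fromOutside = { manager: true, localhost: false, managerhost: false };
const fromLocalhost = { manager: true, localhost: true, managerhost: false };

console.log(ruleMatches(denyManager, fromOutside)); // true: the deny applies
console.log(ruleMatches(denyManager, fromLocalhost)); // false: the deny is skipped
```

Read this way, the rule denies cache manager requests unless they come from localhost or from one of the managerhost addresses.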

8 Caching
- Caching uses faster hardware to save information (code or data) that you have used recently so that, if you need it again, it takes less time to access
- for processing a program, caching takes place in cache memory, which is located either on the CPU or on the motherboard
  - storage is typically for a very brief time period (fractions of a second)
- for secondary storage, data is cached in a buffer on the hard disk
  - storage is typically until there are new hard disk accesses
- for web access, content is cached on the hard disk itself
  - storage is typically for about a month if the information being stored is static (dynamic web content is usually not cached)

9 Forward vs Reverse Proxies
- The typical form of proxy server is the forward proxy
  - a collection of browsers (on the same LAN, or within an organization) share the same proxy server
  - all client requests go to the proxy server
  - the server looks in its cache to see if the material is available
  - if not, the server makes sure that the request can be fulfilled (does not violate any access rules), and sends the request over the Internet
  - once a response is received, the server caches it and responds to the client
- A reverse proxy server is used at the server end of the Internet
  - requests from the Internet come into the proxy server, which then determines which web server to route each request on to
  - this might be used to balance the load of many requests for a company that runs multiple servers
  - it also allows the proxy server to cache information and respond directly if the requested page is in its cache
  - we’ll consider reverse proxy servers in a bit

10 Commands / Comments
- If you want to run Squid upon booting
  - you might add the start-up command to a script in rc.d, init.d or inittab
- Many people do not like running Squid in the main OS environment for security purposes, just as you might not want to run Apache in the main OS environment, so they create a chroot environment
  - this is a new root filesystem directory separate from the remainder of the filesystem
  - anyone who hacks into Squid will not be able to damage your file system, only the chroot environment
- The safest way to shut down Squid is through squid -k shutdown
  - do not use kill
- To reconfigure Squid after changing squid.conf, run squid -k reconfigure
  - this saves you from having to stop and restart Squid
- To rotate Squid log files, use squid -k rotate
  - put this in a crontab to rotate the files every so often (e.g., once a day)

11 ACLs in Squid
- Since Apache can be used as a proxy server, you might wonder why use Squid?
  - Squid allows you to define access control lists (acls), which in turn can be used to specify rules for access
    - who should be able to access web pages via Squid?
    - what pages should be accessible? are there restrictions based on file name? web server? web page content or size?
    - what pages should be cached?
    - what pages can be redirected?
  - such rules are defined in two portions
    - an acl definition (similar to what we saw when defining accessors in bind)
    - followed by an access statement (allow or deny statements)
- Squid offers a variety of acl definition types
  - IP addresses
  - IP aliases
  - URLs
  - User names (requiring authentication)
  - file types

12 ACL - Example
- The most common form of acl is to define and permit access to specific clients
  - we will define some src (source IP address) acls
  - typically with src, we define specific IP addresses or subnetworks (rather than IP aliases)
- acl localhost src 127.0.0.1
  - here, we define the source acl “localhost” to be the IP address 127.0.0.1
- acl mynet src 10.2/16
  - this could also be written 10.2.0.0/16
- Now we use our acls to allow and deny access
  - http_access allow localhost
  - http_access allow mynet
  - http_access deny all
  - here, we are allowing access only from localhost and those on “mynet”; everyone else is denied
  - order of the allow and deny statements is critical; we will explore this next time
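To see why the order of the allow and deny statements matters, here is a rough JavaScript sketch of first-match evaluation over src ACLs. It assumes rules are checked top to bottom and the first matching rule decides. The helper names (`ipToInt`, `inCidr`, `checkAccess`) and the concrete networks are illustrative assumptions, and the sketch simply denies when no rule matches, which simplifies Squid's real default.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True if `ip` lies inside the network written as "base/prefixLength".
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  // A shift by 32 is a no-op in JavaScript, so /0 needs its own case.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Assumed ACL definitions: acl <name> src <network>
const acls = {
  localhost: "127.0.0.1/32",
  mynet: "10.2.0.0/16",
  all: "0.0.0.0/0",
};

// http_access lines in the order they appear in the configuration.
const rules = [
  ["allow", "localhost"],
  ["allow", "mynet"],
  ["deny", "all"],
];

// The first rule whose ACL matches the client decides the outcome.
function checkAccess(clientIp, ruleList) {
  for (const [action, aclName] of ruleList) {
    if (inCidr(clientIp, acls[aclName])) return action;
  }
  return "deny"; // simplification of this sketch, not Squid's exact default
}

console.log(checkAccess("10.2.7.9", rules)); // "allow" (matches mynet)
console.log(checkAccess("192.168.1.5", rules)); // "deny" (falls through to all)

// Moving "deny all" to the top shuts everyone out, including mynet.
const misordered = [["deny", "all"], ["allow", "localhost"], ["allow", "mynet"]];
console.log(checkAccess("10.2.7.9", misordered)); // "deny"
```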

13 Types of ACLs I
- Aside from src, you can also specify ACLs based on
  - src: the IP address of the user (client) whose requests are going from their browser to the Squid proxy server
  - dst: the IP address of the web server (destination)
  - srcdomain and dstdomain: same as src and dst except that these permit IP aliases
  - srcdom_regex and dstdom_regex: same as srcdomain and dstdomain except that the IP aliases can be denoted using regular expressions
  - time: specify the times and days of the week that the proxy server allows or denies access
  - port, method, proto: specify the port(s) that the proxy server permits access to, the HTTP methods allowable (or denied) and the protocol(s) allowable (or denied)
  - rep_mime_type: allow or deny access based on the type of file being returned
  - we will study these (and others) in detail next time
  - myip: like src, but matches the local IP address on the proxy server that the client connected to, rather than the client’s address
  - arp: access controlled based on the MAC address

14 Types of ACLs II
- port: specify one or more port numbers
  - ranges are written with a hyphen (-)
  - multiple ports are separated by spaces or given in separate definitions
  - typically, you will define “safe” ports and then disallow access to any port that is not safe, for example:
    - acl safe_ports port [port(s)]
    - http_access deny !safe_ports
- method: permissible HTTP method
  - GET, POST, PUT, HEAD, OPTIONS, TRACE, DELETE
  - Squid also knows additional methods including PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK, CONNECT and PURGE
  - acl allowable_method method GET HEAD OPTIONS
  - http_access deny !allowable_method
- proto: permissible protocol(s)
  - http, https, ftp, gopher, whois, urn and cache_object
  - ex: acl myprotos proto HTTP HTTPS FTP
- proxy_auth: requires user login and a file/database of usernames and passwords
  - you specify the allowable user names here, such as
  - acl legal_users proxy_auth foxr zappaf newellg
- maxconn: maximum connections
  - you can control access based on a maximum number of server connections
  - this limitation is per IP address, so for instance you could limit users to 25 accesses; once the number is exceeded, that particular IP address gets “shut out”
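The "safe ports" idea can be illustrated with a short JavaScript sketch that parses a space-separated port list with hyphenated ranges and then applies the `http_access deny !safe_ports` logic. The port list used here is an assumption chosen for illustration, not the list from the slide, and `parsePortList`, `portMatches` and `denyUnsafe` are hypothetical helper names.

```javascript
// Parse a port ACL value such as "80 21 443 1025-65535" into numeric ranges.
function parsePortList(spec) {
  return spec.trim().split(/\s+/).map((item) => {
    const [low, high] = item.split("-").map(Number);
    // A single port is treated as a range of length one.
    return { low, high: high === undefined ? low : high };
  });
}

// True if `port` falls inside any of the parsed ranges.
function portMatches(port, ranges) {
  return ranges.some((range) => port >= range.low && port <= range.high);
}

// Assumed example list; adjust to the ports your site considers safe.
const safePorts = parsePortList("80 21 443 1025-65535");

// Mirrors "http_access deny !safe_ports": deny when the port is NOT in the list.
function denyUnsafe(port) {
  return !portMatches(port, safePorts);
}

console.log(denyUnsafe(25)); // true: SMTP port is not in the list
console.log(denyUnsafe(443)); // false: listed explicitly
console.log(denyUnsafe(8080)); // false: inside 1025-65535
```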

15 Time ACLs
- To control when users can access the proxy server, based on either days of the week, or times (or both)
  - S, M, T, W, H, F, A for Sunday through Saturday, D for weekdays
  - time is specified as a range, hh:mm-hh:mm, in 24-hour time
- The format is acl name time [day(s)] [hh:mm-hh:mm]
  - example: to specify weekdays from 9 am to 5 pm: acl weekdays time D 09:00-17:00
  - example: to specify Saturday and Sunday: acl weekend time SA
- The first time must be less than the second
  - if you want to indicate a time that wraps around midnight, such as 9:30 pm to 5:30 am, you have to divide this into two definitions (21:30-23:59 and 00:00-05:30)
  - if days have different times, you need to separate them into multiple statements; for example, defining a time for M 3-7 and W 3-8 would require two definitions
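The rule that a range crossing midnight must be divided in two can be sketched as follows. This JavaScript is an illustrative model, not Squid code: `toMinutes`, `splitRange` and `matchesTime` are hypothetical helpers, and the sketch treats both ends of a range as inclusive, which is an assumption.

```javascript
// Convert "hh:mm" (24-hour time) to minutes since midnight.
function toMinutes(hhmm) {
  const [hours, minutes] = hhmm.split(":").map(Number);
  return hours * 60 + minutes;
}

// Return one range if start < end, otherwise two ranges split at midnight.
function splitRange(start, end) {
  if (toMinutes(start) < toMinutes(end)) return [[start, end]];
  return [[start, "23:59"], ["00:00", end]];
}

// Check a day letter (S M T W H F A) and a time against a set of ranges.
function matchesTime(dayLetter, hhmm, days, ranges) {
  if (!days.includes(dayLetter)) return false;
  const t = toMinutes(hhmm);
  return ranges.some(([s, e]) => t >= toMinutes(s) && t <= toMinutes(e));
}

// 9:30 pm to 5:30 am becomes two definitions.
const night = splitRange("21:30", "05:30");
console.log(night); // [["21:30","23:59"],["00:00","05:30"]]

const weekdays = "MTWHF"; // what the day code D stands for
console.log(matchesTime("M", "23:00", weekdays, night)); // true
console.log(matchesTime("M", "12:00", weekdays, night)); // false
console.log(matchesTime("S", "23:00", weekdays, night)); // false: Sunday is not a weekday
```

Note that in this simple model the after-midnight half is matched against the same day letters as the evening half, so a Friday-night range would need Saturday listed for the early-morning part.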

16 More ACLs and Regular Expressions
- As stated earlier, you can specify regular expressions in srcdom_regex and dstdom_regex
- There are also regex versions to build rules for the URL
  - url_regex and urlpath_regex for the full URL and the path (directory) portion of the URL respectively
  - you might use this to find URLs that contain certain words, such as paths that include “bin”, or paths/filenames that include words like “porn”
- ident_regex to apply regular expressions to user names after the Squid server performs authentication
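The difference between matching the full URL and matching only its path can be shown with JavaScript regular expressions. This is an approximation: Squid uses its own regex engine, so pattern syntax may differ, and the sketch assumes the "path" is just the URL's pathname. The function names and the sample URL are illustrative.

```javascript
// url_regex style: the pattern is tested against the whole URL.
function matchesUrl(pattern, url) {
  return pattern.test(url);
}

// urlpath_regex style: the pattern is tested against the path portion only.
function matchesUrlPath(pattern, url) {
  return pattern.test(new URL(url).pathname);
}

const sampleUrl = "http://www.example.com/cgi-bin/search";

// "bin" appears in the path, so both kinds of rule match.
console.log(matchesUrl(/bin/, sampleUrl)); // true
console.log(matchesUrlPath(/bin/, sampleUrl)); // true

// "example" appears only in the host name, so only the full-URL rule matches.
console.log(matchesUrl(/example/, sampleUrl)); // true
console.log(matchesUrlPath(/example/, sampleUrl)); // false
```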

17 Other ACL Types
- req_mime_type and rep_mime_type
  - test content-type in either the request or response header
  - it only makes sense to use req_mime_type when uploading a file via POST or PUT
  - example: acl badImage rep_mime_type image/jpeg
- browser
  - restrict what type(s) of browser can make a request (matched against the User-Agent header)
- External ACLs
  - this allows Squid to sort of “pass the buck” by requesting that some outside process(es) get involved to determine if a request should be fulfilled or not
  - external ACLs can include factors such as cache access time, number of child processes available, login or ident name, and many of the ACLs we have already covered, but now handled by some other server

18 User Names & Authentication
- The ident acl can be used to match user names
- The proxy_auth acl can specify either REQUIRED or specific users by name, who are then required to log in
  - authentication requires that the user perform a username/password authentication before Squid can continue
  - any request that must be authenticated is postponed until authentication can be completed
  - although authentication itself adds time, using ident or proxy_auth also adds time after authentication has taken place, because Squid must still look up the user’s name among the authentication records to see if the name has been authenticated
- Squid itself does not come with its own authentication mechanisms, so we have to add them as modules, much like with Apache

19 Log Files
- As with Apache, Squid uses log files to store messages of importance and to maintain access and error logs
  - however, one additional log that Squid has that Apache does not is a cache log, in order to record what files are cached
- there are also optional log files available
  - useragent.log and referer.log, which contain information about user agent headers and web referers for every access
  - swap.state and netdb_statestore, which hold information regarding the disk and network performance of Squid
- you can control the names of the log files and which of these optional log files are used through directives in your conf file
- because there are so many logs and they can generate a lot of content, there are log rotation tools available, just as with Apache

20 cache.log
- This log contains
  - configuration information
  - warnings about performance problems
  - errors
- Entries are of the form date time | message
- Configuration messages might include such things as
  - process ID of a starting Squid process
  - successful (or failed) tests to the DNS and the DNS IP address (as obtained from resolv.conf)
  - starting helper programs
- The remaining cache entries are made based on a specified debug level that dictates which types of operations should be logged here
  - normal information, warnings, errors, emergencies, etc.

21 access.log
- Much like Apache’s access log, Squid’s access log will store every request received
- each entry contains 10 pieces of information
  - timestamp
  - response time
  - client address
  - status code of request
  - size of file transferred
  - HTTP method
  - URI
  - client identity (if available)
  - how requests were fulfilled on a cache miss (that is, where we had to go to get the file)
  - content type
- status codes differ from Apache as they indicate cache access as well as server status codes, and include these:
  - TCP_HIT, TCP_MISS, TCP_REFRESH_HIT, TCP_REF_FAIL_HIT, TCP_REFRESH_MISS, TCP_CLIENT_REFRESH_MISS, TCP_IMS_HIT, TCP_SWAPFAIL_MISS, TCP_NEGATIVE_HIT, TCP_MEM_HIT, TCP_DENIED, TCP_OFFLINE_HIT, TCP_REDIRECT and NONE
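As an illustration of the ten fields, the JavaScript below splits one log entry into named parts. It assumes the native format is whitespace-separated in the order listed above, with the cache result and HTTP status joined by a slash. The sample line is made up for this example rather than taken from a real log, and `parseAccessLogLine` and `isCacheHit` are hypothetical helpers.

```javascript
// Parse one whitespace-separated access.log entry (assumed native layout).
function parseAccessLogLine(line) {
  const f = line.trim().split(/\s+/);
  if (f.length < 10) {
    throw new Error("expected 10 fields, got " + f.length);
  }
  const [resultCode, httpStatus] = f[3].split("/");
  const [hierarchyCode, peerHost] = f[8].split("/");
  return {
    timestamp: Number(f[0]), // seconds since the epoch
    elapsedMs: Number(f[1]), // response time
    client: f[2],
    resultCode, // e.g. TCP_HIT or TCP_MISS
    httpStatus: Number(httpStatus),
    bytes: Number(f[4]),
    method: f[5],
    url: f[6],
    ident: f[7], // "-" when no identity is available
    hierarchyCode, // how a miss was fulfilled
    peerHost,
    contentType: f[9],
  };
}

// Treat any result code containing "HIT" as served from the cache.
function isCacheHit(resultCode) {
  return resultCode.includes("HIT");
}

// Made-up sample entry for illustration only.
const sampleLine =
  "1066037222.011 126 10.0.0.1 TCP_MISS/200 3456 GET http://www.example.com/ - DIRECT/192.0.2.1 text/html";

const entry = parseAccessLogLine(sampleLine);
console.log(entry.resultCode, entry.httpStatus, isCacheHit(entry.resultCode));
```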

22 Directives for access.log
- log_icp_queries: default is enabled; allows you to control whether ICP (Internet Cache Protocol) requests are logged or not
- emulate_httpd_log: whether to use the same format as HTTP server access logs (that is, match Apache’s server log) or use Squid’s native format, which contains more information
- log_mime_hdrs: if set to on, Squid will add HTTP request and response headers to each log entry (this adds two more fields to each entry)
- log_fqdn: toggles whether Squid records the client’s (requestor’s) IP address or hostname; if hostname, then Squid has to do a reverse DNS lookup, which takes more time
- log_ip_on_direct: for requests sent directly to the origin server, whether to log the destination’s IP address or hostname
- strip_query_terms, uri_whitespace: whether to remove the query terms from a URL and whether to strip, chop, or encode white space in a URL (if any)

23 store.log
- The store.log file records decisions to store and remove objects from the Squid cache
  - if an object is cached, the entry includes where it was cached and when
  - if an object is uncacheable, then the entry indicates why the object was uncacheable
  - if a cache is full, a replacement strategy is used to decide what to remove, and any such action is logged here
- The store log contains the following fields:
  - timestamp
  - action (SWAPOUT, RELEASE, SO_FAIL)
  - directory number (which cache)
  - file number
  - cache key (the hash value of the object)
  - status code
  - date, last_modified and expires from the HTTP response header
  - content-type
  - content-length/size
  - HTTP method and URI

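24 Sample proxy auto-configuration (wpad.dachser.com)
function FindProxyForURL(url, host) {
    // Plain host names and hosts inside .cawtech.com are reached directly
    if (isPlainHostName(host) || dnsDomainIs(host, ".cawtech.com"))
        return "DIRECT";
    // These schemes go through the proxy, falling back to a direct connection
    if ((url.substring(0, 5) == "http:") ||
        (url.substring(0, 6) == "https:") ||
        (url.substring(0, 4) == "ftp:") ||
        (url.substring(0, 7) == "gopher:"))
        return "PROXY proxy.cawtech.com:3128; DIRECT";
    // Any other scheme is fetched directly
    return "DIRECT";
}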
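A PAC function normally runs inside the browser, which supplies helpers such as isPlainHostName and dnsDomainIs. To try the logic outside a browser, one option is to define simplified stand-ins for those helpers, as in the sketch below. The stand-ins are assumptions that only approximate the browser versions, and the copy of the function assumes the two scheme strings lost from the slide were "http:" and "https:" (suggested by the substring lengths 5 and 6) and adds a final "DIRECT" fallback.

```javascript
// Simplified stand-ins for helpers that browsers provide to PAC scripts.
function isPlainHostName(host) {
  return !host.includes("."); // no dots means no domain part
}
function dnsDomainIs(host, domain) {
  return host.endsWith(domain);
}

// Copy of the sample function, with the assumptions described above.
function FindProxyForURL(url, host) {
  if (isPlainHostName(host) || dnsDomainIs(host, ".cawtech.com")) {
    return "DIRECT";
  }
  if (
    url.substring(0, 5) == "http:" ||
    url.substring(0, 6) == "https:" ||
    url.substring(0, 4) == "ftp:" ||
    url.substring(0, 7) == "gopher:"
  ) {
    return "PROXY proxy.cawtech.com:3128; DIRECT";
  }
  return "DIRECT"; // assumed fallback for any other scheme
}

// Internal names bypass the proxy; external sites go through it.
console.log(FindProxyForURL("http://intranet/", "intranet"));
console.log(FindProxyForURL("http://www.cawtech.com/", "www.cawtech.com"));
console.log(FindProxyForURL("https://www.example.org/", "www.example.org"));
```

The "PROXY ...; DIRECT" return value lists a fallback, so a browser that cannot reach the proxy may connect directly instead.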

