This policy explains what personal information may be gathered and processed by the University's central web search servers, and explains how that information is used.
This policy applies only to processing by the central web search servers. There are other search servers in the University, and you will need to consult those sites for information on their policies. Many University sites do, however, use the central servers to provide searching, so it is easy to use the central search servers without realizing it. Results pages from the central servers have URLs starting http://search.cam.ac.uk/
In common with most web sites, the search servers automatically log certain information about every request made of them (see below for more details). This information is used for system administration, for bug tracking, and for producing usage statistics, and most of it may be kept indefinitely. The information will include either the hostname or the network address of the computer making the request, and for authenticated (logged in) requests also the username. The username should identify the person making the request, and a hostname or network address on its own may provide a strong hint. The network addresses and usernames need to be retained initially, but they are deleted after 3-4 months, after which no personally identifying information is left in the logs. The remaining log data is then retained indefinitely to allow year-on-year comparisons.
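The retention rule above can be illustrated with a minimal sketch. It is not the servers' actual implementation: the LogEntry type, the anonymise function and the 120-day cutoff (one reading of "3-4 months") are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "3-4 months" is approximated here as 120 days.
RETENTION = timedelta(days=120)


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical, simplified log record."""
    timestamp: datetime
    host: Optional[str]      # hostname or network address
    username: Optional[str]  # only present for logged-in requests
    request: str


def anonymise(entry: LogEntry, now: datetime) -> LogEntry:
    """Blank the identifying fields once the entry is older than RETENTION."""
    if now - entry.timestamp > RETENTION:
        return replace(entry, host=None, username=None)
    return entry
```

In this sketch the rest of the record (time, request and so on) is kept, which is what would allow long-term statistics to be produced from entries that no longer identify anyone.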
Relevant subsets of this data may be passed to computer security teams as part of investigations of computer misuse involving this site or other computing equipment in the University. Data may be passed to the administrators of other computer systems to enable investigation of specific problems accessing this site, or of system misconfigurations. Data may incidentally be included in information passed to contractors and computer maintenance organisations working for the University, in which case it will be covered by appropriate non-disclosure agreements. Otherwise the logged information is not passed to any third party except if required by law. Summary statistics are extracted from this data and some of these may be made publicly available, but those that are do not include information from which individuals could be identified.
[You should appreciate that a log is a record of what a server sees, not necessarily what was initially sent. If a request is sent via a proxy the log file may show the proxy's address. If someone has forged your address the log file will show your address. If someone else has logged on using your username and password, the logs will inevitably show you as the person apparently making their requests.]
As part of their normal operation and function, the servers maintain an index of information held on a large number of web servers throughout the University, and return information from that index in response to queries submitted to them. That may include personal data that appears on such pages. For this to happen, the data must already be generally accessible within the University. The data in the index will be automatically updated or removed in line with changes to the original copy (though typically not until a few days, or possibly weeks, later). Web authors and webmasters of sites being indexed by these servers can control what data is included in the index - see 'Excluding search engines'.
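One widely used way for webmasters to exclude pages from search indexes is a robots.txt file. Whether the central search servers honour robots.txt is an assumption here; the 'Excluding search engines' page is the authoritative guide. The sketch below uses Python's standard urllib.robotparser module to show how a crawler might interpret such rules. The crawler name and URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that keeps crawlers out of /private/.
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved crawler checks each URL before fetching and indexing it.
blocked = parser.can_fetch("ExampleCrawler", "http://www.example.org/private/staff.html")
allowed = parser.can_fetch("ExampleCrawler", "http://www.example.org/courses.html")
```

With these rules, blocked is False and allowed is True, so only the second page would be eligible for indexing.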
This site uses web "cookies" as part of user authentication (login) and to store information needed by other facilities. Click tracking stores data anonymously.
The information automatically logged for each request (when searching, or when following result links) is as follows:
- The hostname or network address of the computer making the request. Note that the data recorded may be that of a web proxy rather than that of the originating client.
- The username, when known during authenticated (logged in) access to the site
- The date and time of connection
- The HTTP request, which includes details of any search requested, or a link clicked on from search results
- The HTTP status code of the request (success or failure etc.)
- The number of data bytes sent in response
- The contents of the HTTP 'Referer' header supplied by the user's browser, normally identifying the page that contained the reference to the requested document.
- The content of the HTTP 'User-Agent' header supplied by the user's browser, normally indicating the type and version number of the browser (or other web client) being used and the operating system on which it is being run.
- The outcome of the search (e.g. the number of documents found) and other technical details about how the search request was processed.
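Most of the fields listed above correspond to those of the widely used "combined" web server log format. The search servers' actual log format is not stated in this policy, so the following is only a sketch, assuming that format, of how one such line breaks down into the listed fields. The sample line, pattern and function name are illustrative, and the search-outcome details in the last item are not covered.

```python
import re

# Assumed layout: host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str) -> dict:
    """Split one log line in the assumed combined format into named fields."""
    match = COMBINED.match(line)
    if match is None:
        raise ValueError("line is not in the assumed combined log format")
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # "-" in the bytes column means no response body was sent.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    # "-" in the user column means the request was not authenticated.
    if fields["user"] == "-":
        fields["user"] = None
    return fields


sample = (
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /search?q=library HTTP/1.1" 200 2326 '
    '"http://www.example.org/" "ExampleBrowser/1.0"'
)
record = parse_line(sample)
```

Here record["request"] carries the search terms, which is why the request field is the part of the log that reveals what was searched for.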
Logging of additional data may be enabled temporarily from time to time for specific purposes. In addition, the computers on which the search site is hosted keep records of attempts (authorised and unauthorised) to use them for purposes other than access to the search server. This data typically includes the date and time of the attempt, the service to which access was attempted, the name or network address of the computer making the connection, and may include details of what was done or was attempted.
Access to personal data
For the purpose of the UK Data Protection Act 1998, the 'Data Controller' for the processing of data by this site is the University of Cambridge, and the point of contact for subject access requests is the University Data Protection Officer (The Old Schools, Trinity Lane, Cambridge CB2 1TN, tel. 01223 332320, fax 01223 332332, e-mail: firstname.lastname@example.org).