One good side effect is that I had to learn to use many tools to come up with data, which has taught me a good deal. I thought of documenting some of it, so here are the things that helped me.
- Google
If you are unsure of exactly what you are looking for, i.e. if you are searching for a concept or for the answer to a conceptual question, try Google. Google is good at understanding the context of a question and does a very good job of pointing you in the right direction.
- Keywords
I spent the better part of several days searching on keywords that were perhaps not the right match. This is because “standard of living” is often treated differently from “living standards”. The search engine that comes closest to understanding the similarity is Google; the others, in my opinion, don’t even come close.
So get the keywords right, and in the right order. If you are uncertain, the best way to start looking for keywords is to check Wikipedia or an encyclopedia for related articles. Diving into a search without a clear idea of how the keywords relate takes too much time and is less productive.
- Google search attributes
Google search strings are really handy for reaching some hidden corners. They help you reduce the amount of searching.
- “ ” and +/-
A simple trick is to put the relevant keywords within quotes. When used within quotes, the keywords are taken literally and searched in the same order. Hence “poverty eradication” will give me only pages that contain these two words in the same order and in the same phrase.
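Here is a small Python sketch of the quoting trick, for anyone who builds searches in a script. The helper names and the https://www.google.com/search?q=... URL shape are my own assumptions for illustration. The helper also swaps curly quotes for straight ones as a precaution, since word processors like to change them.

```python
from urllib.parse import urlencode


def exact_phrase(words: str) -> str:
    """Wrap a phrase in straight double quotes so it is searched literally."""
    # Strip any straight or curly quotes already around the phrase.
    cleaned = words.strip().strip('"“”')
    return '"' + cleaned + '"'


def search_url(query: str) -> str:
    """Build a search URL (assumed shape: https://www.google.com/search?q=...)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = exact_phrase("“poverty eradication”")
print(query)
print(search_url(query))
```

The quotes survive URL encoding as %22, so the phrase still arrives at the search engine as one literal phrase.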
- Filetype:
Another important trick is to use filetype:. Most research documents are published as PDFs; very few are published in Word format. If we are looking for data or information that is most likely to be found in research reports, government publications and the like, we can use this string.
The search string “save tiger” filetype:pdf gives me the results for “save tiger”, but only those in PDF format. This sifts out thousands of otherwise irrelevant documents and web pages.
Such usage is also important because the first hundred pages of results are usually filled with commercial sites, and the hard-working professor somewhere would not have thought of search engine optimization.
If you are looking for data that is most likely in MS Excel format, you can use filetype:xls
You will be surprised at what Google can give you with these searches. Sometimes Google indexes files that are not hyperlinked anywhere else. These are usually files stored on a server, files withdrawn from public view but still on the server, or files meant for subscribers.
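If you want to try the same phrase against several file formats, a few lines of Python can generate the search strings for you. This is only a sketch; the function name is made up, and the list of extensions is just an example.

```python
def filetype_queries(phrase, extensions):
    """Return one quoted-phrase query per file extension."""
    queries = []
    for ext in extensions:
        # Normalise ".PDF" or "Pdf" to the bare lowercase form "pdf".
        clean = ext.lower().lstrip(".")
        queries.append(f'"{phrase}" filetype:{clean}')
    return queries


for q in filetype_queries("save tiger", ["pdf", "xls", ".DOC"]):
    print(q)
```

Paste each line into the search box in turn: PDF for reports, XLS for raw data, DOC for the occasional Word document.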
- Site:
Another useful string is site:. This string filters out the rest of the web and searches a particular website. For example, if I search “technology innovation” site:boozallen.com, the results will come only from Booz Allen Hamilton’s website. Since the report we are looking for is likely in PDF format, we can further refine the search with filetype, making it “technology innovation” filetype:pdf site:boozallen.com
We can also filter websites by their top-level domain. For example, in some searches we are inundated with university/school results. We can filter them out by using -site:*.edu (with a plain hyphen). This will remove all the results that come from .edu sites.
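Once you start stacking these strings, it is easy to mistype one. Below is a minimal sketch of a helper that assembles the phrase, the file type, the site and any excluded sites in one go. The function and parameter names are hypothetical, and it only builds the text of the query; it does not talk to Google.

```python
def build_query(phrase, filetype=None, site=None, exclude_sites=()):
    """Combine a quoted phrase with optional filetype:, site: and -site: parts."""
    parts = [f'"{phrase}"']
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    for excluded in exclude_sites:
        # A plain hyphen-minus is used for exclusion, not a typographic dash.
        parts.append(f"-site:{excluded}")
    return " ".join(parts)


print(build_query("technology innovation", filetype="pdf", site="boozallen.com"))
print(build_query("save tiger", filetype="pdf", exclude_sites=["*.edu"]))
```

The first call reproduces the Booz Allen search from above; the second drops the .edu results from the “save tiger” search.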
- InUrl:
Sometimes in our searches we notice that some keywords or data points appear in the URL of the website. For example, if you are searching Edgar Online, you will see that the filings are sorted by year and even month, and these data (years/months) appear in the URL. Therefore, if we are looking for data from, say, 2007, we can add inurl:2007 to the search. This ensures that “2007” is present in the URL, and hence, in the case of Edgar Online, the results will be only from 2007.
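To make the idea concrete, here is a rough Python sketch. One helper appends inurl: to a query; the other applies the same "is the token in the URL?" test locally to a list of addresses. The example.com URLs are placeholders I invented, not real filing addresses, and the simple substring check is my assumption about how the filter behaves.

```python
def with_inurl(query, token):
    """Append an inurl: restriction to an existing query string."""
    return f"{query} inurl:{token}"


def url_has_token(url, token):
    """Local stand-in for inurl: a plain substring check on the URL."""
    return str(token) in url


# Placeholder URLs for illustration only.
urls = [
    "https://example.com/filings/2007/03/annual-report.htm",
    "https://example.com/filings/2006/11/annual-report.htm",
]

print(with_inurl('"annual report"', 2007))
print([u for u in urls if url_has_token(u, 2007)])
```

Note that a substring check would also match "2007" appearing anywhere else in the address, such as a document number, so treat the year filter as a hint and not a guarantee.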
- Language
My recent searches involved a German company. No matter how hard I tried, I did not come up with good results; this is mainly because research reports are not freely available as in
- Some more sources
Some search angles yield good results on company details. Searching for “prospectus”, “RHP” or “rating reports” tends to give more accurate information. Another rich source is annual reports, which you can get from the company website. (The annual report is also a good candidate for an inurl search.) The management discussion, usually found on the company website or on Edgar Online, is a very good way to understand a company. On Edgar, annual reports are Form 10-K and quarterly reports are Form 10-Q.
- Other search engines
I am a big fan of Google searches. But sometimes even Google misses out on simplicity: the results could be so simple. This point struck home the other day when I was Googling something. Google did a good job of giving me results, but still not the data I wanted. I punched the same keywords into Yahoo, and the very first result was exactly what I needed. Google’s “popularity contest” search method has its disadvantages; hence, when stuck, we have to step out of the Googlebox for a breath of fresh air!
- Similarity
Another nifty Google attribute is similarity search. The tilde ~ searches for similar words. For example, if I am unsure whether a company is a pharma company or a healthcare company (which, by the way, makes a big difference in search results), I can use ~pharma; this will give results for all the words strongly related to the word pharma.
- Range
If we are looking for data from a particular span, for example annual reports from 1995 to 1997, we can specify it in the search as 1995..1997 (two plain periods between the numbers). This will give results containing numbers within this range. On its own this is usually a bad idea for the example above, since it will also match the dates in historical financial tables; combined with other strings, though, it can be useful.
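A tiny Python sketch of the range string, written with two plain periods between the numbers (word processors tend to turn typed dots into a single ellipsis character, which is not the same thing). The function name is my own, and pairing the range with a quoted phrase is just one way to narrow it down.

```python
def number_range(start, end):
    """Format a numeric range as start..end, using two plain periods."""
    if start > end:
        raise ValueError("start must not be greater than end")
    return f"{start}..{end}"


# Combine the range with a quoted phrase so it does not match every
# stray number between 1995 and 1997.
query = f'"annual report" {number_range(1995, 1997)}'
print(query)
```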
- Directory search
Try a directory search. Actually, it would be better to start off with a directory search, which simply means somebody has already ordered the results for us. This is a much less painful way of searching.
- Preferences
Use Google preferences to the maximum. Having more results shown on a single page saves lots of time. The extra time spent loading results is negligible, but the time saved in quickly scanning the results and not loading subsequent pages is huge.
- Historic facts
If you are looking for historic facts or data, it is good to look in Google Books. Sites like Gutenberg have historic books/facts but do not carry full pictures. Google, too, does not show all the details (tip: filter out snippet view; it’s useless unless you are looking for book details, which is pretty rare) but carries more information in terms of footnotes, bibliography, graphs and pictures. Google Books also categorizes these books by subject.
- Google News
Google News is one of the best ways to do real-time research. For example, if you look for “crude oil” and sort by date, the real-time updates will be a surprise. Pick and choose your sources, though. Most of them are reproductions, but you stay updated.
- Governmental agencies
The data on government agency sites is largely wasted from a search perspective: these sites rarely come out on top of any search results. Hence, if you are looking for something specific, it is good to restrict the search to government sites with a *.gov.* search. [tip: using only *.gov filters out the results from, say, *.gov.in]
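To see why the two patterns behave differently, here is a sketch that tests host names against them using ordinary shell-style wildcards from Python's fnmatch module. This is an assumption on my part: Google's own wildcard handling may not follow these rules exactly, so take it as an illustration of the patterns and not of the search engine.

```python
from fnmatch import fnmatchcase


def matches_any(host, patterns):
    """True if the host name matches at least one shell-style pattern."""
    return any(fnmatchcase(host.lower(), pattern) for pattern in patterns)


hosts = ["www.census.gov", "india.gov.in", "www.example.edu"]

for host in hosts:
    only_gov = matches_any(host, ["*.gov"])
    gov_any_country = matches_any(host, ["*.gov", "*.gov.*"])
    print(host, only_gov, gov_any_country)
```

Under these wildcard rules *.gov alone misses india.gov.in, while *.gov.* alone would miss www.census.gov, so trying both patterns covers government sites with and without a country suffix.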
Conclusion
This is in no way a complete list. I am still learning new tricks every day, or should I say Google teaches me new tricks every day! Ultimately, how the knowledge is used and the judgment on the veracity of data lie with the user. It is better to stick with known sites than to take good-looking data from unknown ones.
There is tons of stuff hidden out there; we just have to look for it. If you know any tricks, please make a note in the comments.