So I'm an academic researcher in the early stages. You start by learning the general basics and then you pick a "field" within the broader subject. Slowly, within your field, you specialize by learning a couple of tools better and better, hopefully by thinking about what kinds of questions you are interested in and investing the time and energy to master certain approaches.
Frequently, if you are smart, you look at academic papers that are well published or have lots of citations or both and use them as signposts to make decisions. One of my friends just had a disappointing job market result because the choices he made turned out to be a minor bubble that popped at the exact moment he went on the market. A topic and approach that was very hot and exciting a few years ago drew in a bunch of people who have done a lot of work, and the feeding-frenzy hiring of the last year or two means that everywhere that wanted to hire one of those kinds of guys has already hired one. One of my office mates has literally invested the time and energy to master the most boring, purely mathematical statistical junk I can imagine, so that he can write papers on the deep theoretical underpinnings of statistical distributions, because these hard-to-read and harder-to-write papers publish really well (probably because the handful of authors are the only people who can read and judge them, thereby creating a perverse incentive problem for journals). He pretends to be genuinely interested in this stuff, but after a few drinks he will admit that he finds it all as boring and pointless as everyone else does and that he is just whoring for publication.
So this morning, I got some data, some very specific data. There is a sub-literature that I have invested my time in learning the ins and outs of. There are only a handful of papers out there on this little sub-subject. Most of them are not terribly wonderful, but they have massive citation counts. The math and concepts don't seem too hard, easy enough for a dumb dumb like me to handle. And I'm actually interested in the implications of this set of topics, with the added benefit of feeling like I can do a decent job of attacking some of the issues. So less-than-great papers with 1600 citations make me think that I can write something better and get citation counts like that.
I've spent a bit of effort constantly bugging a guy since August (for those of you who have played me before, I'm consistently obnoxious in real life too in order to get what I want, and it usually works eventually) for access to the data that he based this 1600 citation paper on. The data is from a professional service and ridiculously expensive, like NSF grant money level expensive. Between August and now, I've spent far more time pulling data electronically from a half dozen sources and using the wonders of python to aggregate it into the form that I need. But my data set is up to date, reflecting the world circa, say, 2010, and the data the guy caved in and gave me this morning is from 2006, which actually lets me do things through time, which could be super cool. I cannot imagine how else I could possibly get old data for what I'm looking at. Having gone through the expensive data, though, I can say it is garbage. I have no idea whether my own data is accurate: whether the data at any stage was collected well; whether my stitching stuff together caused errors; whether it is complete enough to be meaningful; whether it reflects the world as it really is. But the data I got this morning is not even internally consistent. I immediately spotted things that were missing in it. Within five minutes of looking at it superficially, I spotted internal inconsistencies. To compare it to a completely unrelated topic, imagine one of those mileage charts on maps that tell you how far the drive from Chicago to Detroit is. It is hard to know whether the number listed is "right," but you know that Chicago to Detroit should bear some close resemblance to Detroit to Chicago. This data set has 30% differences in values that should be equal.
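The mileage-chart test is simple enough to sketch in python. This is only an illustration of that kind of sanity check, not the real data or the code I actually ran: the function name `check_symmetry`, the 5% tolerance, and the city distances are all made up for the example.

```python
def check_symmetry(table, tolerance=0.05):
    """Check a pairwise table for values that should match in both directions.

    `table` maps (origin, destination) -> value. Returns two lists:
    mismatched pairs as (a, b, forward, backward, relative_difference),
    and missing reverse entries as (origin, destination) keys.
    """
    mismatched, missing = [], []
    for (a, b), forward in table.items():
        if a == b:
            continue  # the diagonal has nothing to compare against
        if (b, a) not in table:
            missing.append((b, a))  # reverse direction is absent entirely
            continue
        if a > b:
            continue  # this pair is handled when we visit (b, a)
        backward = table[(b, a)]
        larger = max(abs(forward), abs(backward))
        if larger == 0:
            continue  # both values are zero, so they agree
        rel_diff = abs(forward - backward) / larger
        if rel_diff > tolerance:
            mismatched.append((a, b, forward, backward, rel_diff))
    return mismatched, missing


# Hypothetical values, chosen only to exercise the check.
sample = {
    ("Chicago", "Detroit"): 283,
    ("Detroit", "Chicago"): 283,
    ("Chicago", "Cleveland"): 345,
    ("Cleveland", "Chicago"): 250,   # disagrees with the forward value
    ("Detroit", "Cleveland"): 170,   # no reverse entry at all
}

mismatched, missing = check_symmetry(sample)
for a, b, fwd, back, diff in mismatched:
    print(f"{a} <-> {b}: {fwd} vs {back} ({diff:.0%} apart)")
for a, b in missing:
    print(f"missing entry: {a} -> {b}")
```

A check like this says nothing about whether any single number is "right." It only catches entries that contradict each other or are absent, which is exactly the kind of problem that shows up in five minutes of looking.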
The paper that was written off this data was not that great. It says some interesting things, but it is "important" because the data is exclusive, not because what was done with it was amazing. The price quote I got from the "professional" consulting firm that sells the data was flabbergasting. So a 1600 citation academic paper that is in many ways the definitive treatment of a specific topic is based on bad data, and a commercial information service is charging huge fees for clearly incomplete data that isn't even internally consistent.
The more you learn, the more you see the "experts" cutting corners out of laziness and getting away with it totally. Politicians are obviously an entirely different matter, but probably every academic harbors some sort of borderline fascist tendency to dream of a world in which "enlightened despots," free from consensual governance, would be free to listen to and be guided by "the experts" in designing "progressive" public policy. I would think that the one good thing to come out of that kind of mindset would be a seriousness about being sure of things and about judging academic research rigorously.