Proto-Indo-European (pIE) is the hypothetical ancestor of these Indo-European languages. It's hypothetical because we have no direct evidence that it ever existed: its Neolithic speakers left no writings. Instead, its existence has been deduced from the study of cognates - sets of words in different languages with similar forms and similar meanings. The similarities between Latin frater, Greek phratér, Gothic bróþar, Irish Gaelic bráthair and English 'brother' are obvious, but it wasn't until the 18th century - when European scholars began to study Sanskrit, the classical language of India - that it was realised that languages separated by thousands of miles, and with no known history of contact, also shared these characteristics: the Sanskrit is bhrátár. Sir William Jones, a judge in India in the late 18th century, described the relationship in a classic address to the Asiatick Society of Calcutta in 1786:
The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source. [...] There is a similar reason, though not quite so forcible, for supposing that both the Gothick and the Celtick, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family. [1]
This was all very interesting, but of little use in practice. The great leap forward in Indo-European studies came when one crucial fact was established. It had been understood that these languages developed from Jones's 'common source' through changes in pronunciation, but it was not realised for some time that the changes were regular, invariable, exceptionless. If a 'd' sound becomes a 't' sound, it does so everywhere that the same conditions hold - if there are apparent exceptions, there will be some explanatory factor, often a sociolinguistic one such as borrowing from another dialect or language. Simple as it is, this may not sound like much, but it makes possible one of the most interesting areas of historical linguistics: the reconstruction of proto-Indo-European.
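The idea of an exceptionless change can be sketched in a few lines of Python. What follows is a toy illustration only, not a tool that any linguist uses: the rule table is a simplified version of the consonant shift that separates the Germanic languages from pIE (usually called Grimm's Law), the function name apply_shift is invented for this example, and the input forms are rough ASCII spellings of reconstructed roots with all diacritics and vowel changes ignored.

```python
import re

# Simplified consonant shift from pIE to Germanic (Grimm's Law).
# Assumption: plain ASCII spellings, vowels left untouched, and none of the
# conditioned exceptions (such as Verner's Law) are modelled.
SHIFT = {
    "bh": "b", "dh": "d", "gh": "g",   # voiced aspirates lose their aspiration
    "p": "f", "t": "th", "k": "h",     # voiceless stops become fricatives
    "b": "p", "d": "t", "g": "k",      # voiced stops become voiceless
}

# Two-letter symbols are listed first so that 'bh' is read as one sound,
# not as 'b' followed by 'h'.
PATTERN = re.compile("bh|dh|gh|[ptkbdg]")


def apply_shift(form: str) -> str:
    """Apply every rule once, everywhere it matches, in a single pass."""
    return PATTERN.sub(lambda m: SHIFT[m.group(0)], form)


if __name__ == "__main__":
    for root in ("bhrater", "ped", "dekm"):
        print(root, "->", apply_shift(root))
```

Because the rules are applied in one pass over the original form, the output of one rule is never fed into another: a 'd' that has become 't' does not go on to become 'th'. The consonants that come out for 'bhrater' line up with those of Gothic bróþar, where þ is the 'th' sound, which is the kind of regular correspondence that reconstruction relies on.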
The best-known success of reconstruction goes by the grand name of Saussure's Laryngeal Hypothesis. In the late 19th century, the Swiss linguist Ferdinand de Saussure was examining some irregularities in the patterning of vowels in Indo-European languages. He proposed that pIE had possessed a small set of sounds, later identified as 'laryngeal' consonants and now usually reckoned at three, that had been dropped by its descendant languages but had left an effect on the vowels that adjoined them. Then, in the early part of the 20th century, clay tablets found at Amarna in Egypt and Bogazköy in Turkey were deciphered and turned out to be written in Hittite, the language of an extinct people known only from scattered references in ancient texts. Crucially, Hittite preserved a consonant in many of the positions where Saussure's theory predicted one.
Despite this kind of accomplishment, the problems of reconstruction are diverse. We have no idea whether any of the reconstructed words actually existed; indeed, the sound system traditionally reconstructed for proto-Indo-European looks unlike that of almost any attested natural language. There may be many pIE words that cannot be reconstructed, simply because they died out in all except one of its descendant groups. Finally, comparative reconstruction has no place for sociolinguistic phenomena such as dialect variation, vocabulary change or cultural influences on language. In short, we know a lot about what pIE seems to have been like: we know some apparent vocabulary and we have inferred a lot about its phonetics (the speech sounds it used), phonology (how they were put together) and grammar. We deduce that noun forms were inflected according to number, case and gender, and verb forms according to number, person, tense and mood - all quite complicated. But we have no way of knowing for sure if these hypotheses are true.
Many theories about where pIE was first spoken have been suggested, argued over and discarded: current prevailing wisdom places the 'homeland' on the south Russian steppes, north of the Black Sea, but 'linguistic paleontologists' have argued the case for many other locations. Generally these theories are based on the presumed existence in proto-Indo-European of certain words which allow cultural or geographical conclusions to be drawn. For instance, conclusions about the prevailing climate can be drawn from the reconstruction of the pIE roots *sneigwh- [2] (Latin nix, Greek niphos, Gothic snaiws, Gaelic sneachta, 'snow') and *g'heim- (Latin hiems, Greek kheimon, Gaelic geimhreadh, Sanskrit hima, 'cold' or 'winter'). Other inferences can be drawn from the absence of words - there is no pIE reconstruction of 'sea', which suggests that its speakers originated inland and developed individual words for the sea as they encountered it during tribal migration. However, these conjectures are often problematic, because it is impossible to model accurately how much the meaning of words may have shifted since the early days of pIE. The reconstructed *loks (German Lachs, Icelandic lax(fiskur), Russian losos, 'salmon') implies that the language was spoken in a region with salmon, which is to say northern Europe or north-western Asia, but the cognates of *loks mean 'trout' in some languages, and just 'fish' in others.
Given this, it seems unlikely that we will ever be able to deduce the origins of the Indo-Europeans through purely linguistic evidence.