Saturday 4 April 2015

On Intelligent Life

There's a constant search for extra-terrestrial intelligence, but could intelligent life have developed on earth in the past?

First let's define intelligent life - dolphins are intelligent, as are ravens, octopuses and wolves.  There are two things that make Homo sapiens seemingly unique amongst the intelligent species of today - tool usage and communication.  Whilst other species have tool-using abilities (ravens, octopuses, the great apes) and rich communication (dolphins and other cetaceans), no other current species combines these two characteristics, and the combination has allowed humans to dominate the planet.  It's unlikely that these two characteristics haven't occurred together in the same species at some point in the earth's evolutionary past.

So, could the kind of intelligence that we're looking for in the SETI programme have existed in the past history of earth?

We need to look at the artifacts left by our civilisation - if humans disappeared tomorrow, what artifacts of our culture would remain for future species to discover?  The World Without Us covers this in significant depth.  The plastics we've created should remain as a thin but detectable layer of strata, though it's probable that a microbe will evolve to eat most of the synthetic polymers we produce.  The other significant relic of human civilisation comes from the nuclear industry - nothing in nature concentrates isotopes the way we do, and these will leave a signature that should be relatively easy to detect - indeed, the natural nuclear reactor discovered at Oklo was initially thought, perhaps, to be the product of an ancient culture.  The satellites placed in high earth orbit will probably remain there until the sun expands into a red giant, but they are extremely hard to detect.

Our culture also produces objects of high durability that are nonetheless difficult to detect over geologic time - glass and ceramics will eventually be ground to a powder or eroded; stainless steel cutlery and bronze sculpture last an incredibly long time yet will eventually become nothing more than an unusual deposit in rock strata; gigantic engineering achievements such as the Suez canal or the Hoover dam would silt up and be gone within a few hundred years.

It's interesting that if we were looking for a civilisation equal to our own in the distant geologic past, we would not be looking for evidence of cities (eroded to unrecognisability within a few millennia), agriculture (reclaimed by native species within decades) or even a spike in atmospheric carbon dioxide (which could equally be due to extremely large volcanic eruptions).  Instead, we'd be looking for buried bronze sculpture, rubbish dumps, and radioactive waste.

On balance, it's unlikely that a society as advanced as our own has existed on earth before - though the discovery of inhuman bronzes, or unpleasant waters would certainly change that.



Monday 18 August 2014

Dark matter and an electron-poor universe

This started out as a fantastic idea, but I quickly realised that it just didn't stack up!  But as a physics idiot, I'll try to demonstrate here how to first set up and then destroy your own theory, instead of going down the pathological science route!

Hypothesis

The universe as a whole contains far more nuclei than there are electrons.

What it explains

Dark matter - without electrons, the only interactions between nuclei are via the repulsive electromagnetic field between the positively charged nuclei and, should they be travelling fast enough, the strong force.  The lack of electrons means no spectral absorption lines, no bonding means little to no clumping together of matter, and nuclei are massive enough to avoid being "hot" dark matter like neutrinos.

A bit more detail/Research needed

When Big Bang recombination occurred, the universe went from opaque - where a free electron would capture any passing photon - to transparent - where the universe had cooled enough to bind electrons to nuclei.  If there were a large excess of electrons in comparison to the number of nuclei, the universe would not have become transparent (it would be "foggy", whereas we see the opposite).  If the number of nuclei was the same as, or in excess of, the number of electrons around, then we get transparency.  Existing theory suggests that the universe is electrically neutral [need cites] - i.e. the number of nuclei matches the number of required electrons.  [need to check if there's evidence]
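
For reference (these are standard textbook values, not part of the hypothesis), recombination is the capture reaction

$e^- + p^+ \rightarrow \mathrm{H} + \gamma$

which completed once the universe had cooled to roughly 3000 K, around 380,000 years after the Big Bang.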

The problems

Why does the local area have an excess of electrons?  (possible answer: It's related to the fact that the local area is a supernova remnant).  (Maths/reading required).

Ramifications

Early stars composed of primordial Big Bang nucleosynthesis (BBN) material will be electron deficient.  Whilst this doesn't affect fusion, it will affect electron degeneracy pressure at the core of stars - thus no helium flashes or carbon detonation, due to the lack of free electrons.
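
To spell out the degeneracy point (this is the standard non-relativistic result, nothing specific to the hypothesis): electron degeneracy pressure depends only on the electron number density $n_e$,

$P = \dfrac{(3\pi^2)^{2/3}}{5} \, \dfrac{\hbar^2}{m_e} \, n_e^{5/3}$

so stripping out the electrons strips out the pressure that would otherwise trigger a helium flash.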

Testable Predictions

In areas where BBN primordial abundances are observed, the amounts of lithium and beryllium should be lower than the theory predicts.  This isn't because less lithium is actually produced by BBN; it's because one of lithium's electrons sits in the second shell, binding at around 5.4 eV against hydrogen's 13.6 eV, so a lower temperature is required for that electron to bind - making it more likely that H or He would capture the electron beforehand, leaving a bare lithium ion that doesn't show its absorption lines in the CMB.

I don't think we can test the overall positive electrical charge of a galaxy (this theory suggests that any region containing a lot of electron-less nuclei would carry a large net positive charge) - there's not enough for it to interact with?

Decay Channels

I started looking in detail at lots of decay channels - but it's actually very simple when you consider conservation of charge.  Essentially, every proton converted to a neutron (e.g. in diproton 2He decay to deuterium 2H, but in other decay channels as well) must emit a positron to conserve charge - this positron would then annihilate with an electron, producing gamma rays.  Very generally, for every neutron you find around here (i.e. on earth), a positron was created in the process that then annihilated with an electron.
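
Written out (standard particle physics, nothing exotic), the two steps are

$p \rightarrow n + e^+ + \nu_e$  (inside a nucleus or other decay channel)
$e^+ + e^- \rightarrow 2\gamma$, with $E_\gamma = 511\,\mathrm{keV}$ each, for annihilation at rest.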

Follow Up (Hypothesis goes bang)

Somebody has actually come up with exactly the same theory, too, but I can't see them addressing the "why is it not like that round here" problem.


This paper points to the universe being electrically neutral during BBN.  In addition, we would see different results in the very early universe regarding expansion - it would be influenced more by electromagnetism than by gravity.  We'd also probably see large-scale electromagnetic effects from such a positively charged proton cloud that would wash out any gravitational effects - and these would probably be observable.

And finally, the most important question - "why is it not like that round here" - has been utterly trounced.  In the large supernovae that produce the elements we see on earth, we would actually expect vast numbers of electrons to be destroyed in the process - one electron destroyed for every neutron created.  Instead, we have an excess of electrons around here.  This does lead to a further question - given that large stars, and particularly type II supernovae, will destroy a vast number of electrons by annihilating them with positrons, and we live in an area where a type II supernova occurred (because we have lots of elements heavier than helium around here!) - why do we still have charge balance?




Thursday 10 July 2014

SSL is broken

There's a magic padlock icon that appears in your browser indicating that you're secure - and that nobody in the middle can read the traffic - and it's probably broken.

Certificate Authorities

The problem is not that the encryption scheme is broken - the public/private key structure is fine and has been demonstrated to be secure.  It's that there are far too many certificate authorities, and a single mistake or deliberate outside interference (for example, from governments) can allow a man in the middle to decrypt all traffic and read what's being sent to and fro.

There are a lot of certificate authorities - most are telecoms companies or related to a national government in some way, and all of them can issue certificates for any website.  In addition, it's possible for a certificate authority to issue a wildcard intermediate certificate to an organisation, which can then do exactly the same.

The way an SSL certificate is validated is that when your browser contacts a secure site, the site returns its certificate and a chain up to a root certificate.  As soon as the browser finds a root certificate that it already knows about, it assumes it's all fine and the connection is secure.
The problem is that if one of these certificates is compromised, or abused in some way by the company owning it, a man in the middle can read all of the traffic in between.  This is not a hypothetical situation - it's happened already - with TurkTrust, with Nokia, and with DigiNotar.  What's worrying is that there are lots of certificate authorities, and it only takes one of them to be incompetent to render traffic insecure.
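
For the curious, this is roughly the check the browser performs. A minimal Java sketch, assuming the site's chain and a keystore of trusted roots have already been loaded (revocation checking is switched off purely to keep the sketch short):

import java.security.KeyStore;
import java.security.cert.*;
import java.util.List;

public class ChainCheck {
    // Throws an exception unless 'chain' (site certificate first) leads up to
    // a root in 'trustedRoots' - the same pass/fail decision the browser makes.
    static void validate(List<X509Certificate> chain, KeyStore trustedRoots) throws Exception {
        CertPath path = CertificateFactory.getInstance("X.509").generateCertPath(chain);
        PKIXParameters params = new PKIXParameters(trustedRoots);
        params.setRevocationEnabled(false); // sketch only - real validation should check revocation
        CertPathValidator.getInstance("PKIX").validate(path, params);
    }
}

Note that nothing here asks *which* root signed the chain - any one of the hundreds of trusted roots will do, which is exactly the weakness described above.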
 
There are also other instances of https traffic being decrypted, with varying levels of validity - company firewalls occasionally do it using their own CA (which requires modification of each client computer), and anti-virus software with parental controls can also do this.  Needless to say, I believe these should simply not be allowed.  By installing parental controls (on some anti-virus systems) you are effectively giving your anti-virus company permission to view your bank details, and I don't think most people would be happy with that.

Just as concerning are government security agencies.  Whilst the examples above are the result of incompetence, security agencies could go to the certificate authorities directly and request a wildcard certificate - which would probably be granted.  This would mean that the security agency could happily decrypt all traffic, and nobody would be able to detect that they were doing so.

When a root CA is found to be compromised in some way, revoking it is a deeply painful process that can take months whilst each browser's list of root certificates is replaced.  Even worse are embedded systems, which may never have their CA list refreshed.

Whilst most companies' applications won't change, I would recommend that all banking and financial transaction apps use some man-in-the-middle prevention - namely, EKE encryption to detect that this is taking place and prevent data being transferred.  Whilst some banks do this already, NatWest does not, and it really should!
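
A related mitigation - not EKE, but one that also detects this class of attack - is certificate pinning: the app ships with a fingerprint of the public key it expects, and refuses to talk if the server presents anything else. A minimal Java sketch, where the pin value and URL are hypothetical placeholders:

import java.net.URL;
import java.security.MessageDigest;
import java.security.cert.Certificate;
import java.util.Base64;
import javax.net.ssl.HttpsURLConnection;

public class PinCheck {
    // Hypothetical pin: base64-encoded SHA-256 of the bank server's public key
    static final String EXPECTED_PIN = "replace-with-real-pin";

    public static void main(String[] args) throws Exception {
        HttpsURLConnection conn =
                (HttpsURLConnection) new URL("https://bank.example.com/").openConnection();
        conn.connect();
        Certificate[] chain = conn.getServerCertificates(); // server's own cert comes first
        byte[] spki = chain[0].getPublicKey().getEncoded();
        String pin = Base64.getEncoder().encodeToString(
                MessageDigest.getInstance("SHA-256").digest(spki));
        if (!EXPECTED_PIN.equals(pin)) {
            throw new SecurityException("Pin mismatch - possible man in the middle");
        }
        // Only now is it safe to send credentials over this connection.
    }
}

A rogue CA can mint a certificate the browser will trust, but it can't reproduce the pinned public key, so the check above fails even when the chain validates.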

Quantum Computing

A little further ahead, we have quantum computing, especially with the new we-think-it's-quantum-but-we're-not-really-sure D-Wave systems.  Using Shor's algorithm and a sufficiently powerful quantum computer, all root certificates could be compromised (again, probably by national security agencies) and, again, it would be extremely difficult to detect that a man in the middle attack was being perpetrated.  There's something the certificate authorities could do now to combat this, and that's use an algorithm that can't be broken using Shor's algorithm - i.e. not integer factorisation or the discrete logarithm problem.  There are other schemes, but they don't seem to have any take-up by certificate authorities right now.


Monday 24 January 2011

On Minecraft

Sound in games is very difficult to get right - it's a bit like make-up - a little makes everything a lot better, a lot makes things worse. And having none at all is better than too much.

Minecraft gets it right - the sound is functional and pretty spooky at times. When there's a zombie hanging around outside your door or down a passageway that you can't see, the sound of it is incredibly atmospheric, as well as being informative. What you don't get (and what would have been added by a big-budget software house) is the sound of birds, crickets and various paraphernalia that you simply don't need. In some ways the sound of Minecraft reminds me of that from a very old game, Dungeon Master - functional, informative, and downright terrifying at times.

Minecraft also gets music right. Instead of being constantly on, the music fades in and out - there will be long periods with no music at all, but when it does arrive, it's incidental and adds to the experience rather than distracting from it. A bigger-budget game would have swamped the music channel with crap that 95% of people turn off. (Seriously, game designers: if I want to listen to music whilst playing a game, *I* will choose what it is, not you.) The one proper commercial game that I can think of that does music right is the GTA series - again, it's not always on - it's incidental and doesn't distract.

Wednesday 6 January 2010

To write a spider, you'll need some kind of batch script that reads the tables and creates the entries in your keywords table. Generally, this is pretty simple, along the lines of the following (sorry about the horrible pseudo-code - it's a mixture of 4gl and Java!)....

for each customer
    split_into_words(customer.name, 100)
    split_into_words(customer.address1, 10)
    split_into_words(customer.address2, 10)
    ....etc....
end foreach

for each invoice
    store(invoice.number, 100)
    store(invoice.customer_reference_number, 20)
end foreach

.... etc .....

function split_into_words(string, significance)
    string[] words = split(string)
    for each word in words
        if word exists in [keywords table]
            update it, adding significance
        else
            insert word into [keywords table]
        end if
    end foreach
end function

function store(string, significance)
    if string exists in [keywords table]
        update it, adding significance
    else
        insert string into [keywords table]
    end if
end function



---------

The spider then needs to be callable from just about anywhere - respidering a particular customer, a particular subset of accounts, a particular subset of invoices, etc.

The section field within the keywords table is an indicator to whatever is reading the table of where to redirect. E.g. if it's an invoice, put an "I" in the section; the function that reads the keywords table should know to produce a link to an invoice. If a customer, put a "C" in the section, and the reader of the keywords table should produce a link to a customer. Etc, etc, etc.
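
As a sketch of that dispatch in Java (the link formats are made up - substitute whatever your screens or URLs actually look like):

// Hypothetical link builder keyed on the section code.
static String linkFor(String section, String reference) {
    if ("I".equals(section)) return "/invoice/" + reference;
    if ("C".equals(section)) return "/customer/" + reference;
    return "/search?q=" + reference; // fallback for sections we don't know yet
}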

The difficult parts are:

1) Assigning significance. If you have customer id 32800 and invoice id 32800, which is more important? Is "Chester Street Motors" more important than a company that's on "Chester Street"? Significance tuning is an art, but if what the user wants is in the top 10, it'll be ok. Important: watch what users search for, and code your spider accordingly.
2) When and what to respider? Ideally, the spider runs every time something new gets created on the system. You need to be able to respider from anywhere, from any system, whether it be modern Java or C#, or your legacy COBOL systems. Make sure, however you write your spider, that you can call it with multiple options (respider this account, these invoices, etc), from anywhere, at any time.
3) Creating the keyword index is an intensive process. You MUST not affect the running of the live system. In practice, when running a full respider, this means building a temporary table that is then renamed over the live table (renaming a table, once built, is inexpensive for a database) - see the sketch after this list. And how do you cope with additions during the re-indexing process? I leave that answer to you, or possibly, to google :-)
4) Finally, how do you cope with mis-spellings? For example: Refrigeration Supplies Ltd. Is that spelt right? Are you sure it's not Refridgeration Supplies Ltd? So, it might be spelt wrong in your database, or in the user's search. How do you deal with that? (I suggest further reading on Soundex and Double-Metaphone - I don't really have a great solution for you!!!)
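
Here's the table swap from point 3 as a minimal sketch, assuming a JDBC connection, that the full respider has just finished populating keywords_new, and that your database supports ALTER TABLE ... RENAME (the exact syntax varies by vendor):

import java.sql.Connection;
import java.sql.Statement;

public class IndexSwap {
    // Swap the freshly built index over the live one - two quick renames
    // instead of hours of updates against the table the live system is using.
    static void swapInNewIndex(Connection conn) throws Exception {
        try (Statement st = conn.createStatement()) {
            st.execute("alter table keywords rename to keywords_old");
            st.execute("alter table keywords_new rename to keywords");
            st.execute("drop table keywords_old");
        }
    }
}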

Friday 7 August 2009

How to do search programmatically

Here's a common problem - you have deep data: you have invoices in multiple different database tables; you have customer reference numbers somewhere else; you have customer details in a further table; you have order numbers; you have products you sell; etc.

How do you provide a single simple search interface that finds exactly what the user wants from a single 'search box'?

Well, it's actually pretty simple. It does get more complicated later on, but don't worry, all problems are surmountable! Let's describe what we need in terms of SQL...

create table keywords (
    keyword      varchar(20),
    reference    varchar(20),
    section      varchar(1),
    significance integer
);

create unique index i_keywords_1 on keywords (keyword, reference, section);

---
From this basic keywords table (and yes, add your own unique references as necessary), you can construct highly efficient queries. For example, searching for 'Fleetwood'...

select K1.reference, K1.section, K1.significance
from keywords K1
where K1.keyword = 'Fleetwood'
order by K1.significance desc
;

This is efficient as an SQL query - the database searches down the keyword index first. If you enter two words, e.g. 'Fleetwood' 'Mac'...

-- AND query...
select K1.reference, K1.section, (K1.significance + K2.significance) sig
from keywords K1, keywords K2
where K1.reference = K2.reference
and K1.section = K2.section
and K1.keyword = 'Fleetwood'
and K2.keyword = 'Mac'
order by sig desc

-- OR query...
select K1.reference, K1.section, K1.significance
from keywords K1
where K1.keyword = 'Fleetwood'
union
select K2.reference, K2.section, K2.significance
from keywords K2
where K2.keyword = 'Mac'
order by 3 desc

--
Again, extremely efficient as an SQL query. Any half-decent database server will return the result in milliseconds, and the SQL is easy to construct.
--
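
Since the AND query has the same shape for any number of words, it's easy to generate. A sketch in Java/JDBC (the KeywordSearch name and the connection handling are mine, not part of the schema above):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class KeywordSearch {
    // Builds and runs the AND query above for any number of search words.
    static ResultSet search(Connection conn, String... words) throws Exception {
        StringBuilder sql = new StringBuilder("select K1.reference, K1.section, (K1.significance");
        for (int i = 2; i <= words.length; i++)
            sql.append(" + K").append(i).append(".significance");
        sql.append(") sig from keywords K1");
        for (int i = 2; i <= words.length; i++)
            sql.append(", keywords K").append(i);
        sql.append(" where K1.keyword = ?");
        for (int i = 2; i <= words.length; i++)
            sql.append(" and K").append(i).append(".reference = K1.reference")
               .append(" and K").append(i).append(".section = K1.section")
               .append(" and K").append(i).append(".keyword = ?");
        sql.append(" order by sig desc");
        PreparedStatement ps = conn.prepareStatement(sql.toString());
        for (int i = 0; i < words.length; i++)
            ps.setString(i + 1, words[i]); // JDBC parameters are 1-indexed
        return ps.executeQuery();
    }
}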

So, now comes the slightly harder part - populating the data.

You will need a "spider" to find the information and put it in the keyword index. This is a pretty intensive database process, requires some art to write, and can have problems when you want a truly up-to-date index.

More tomorrow.

Friday 20 March 2009

Wicket

Wicket is a Java framework for building web applications which we're using at work, and I'm continually impressed with how easy things are to do with it. I've worked on other Java projects that were just horrible, though, so maybe my experience with Wicket is the one everyone else has with other frameworks. We've been down the Struts approach, which means holding data in eight different classes and properties files. We've been down the XML route, which made the problem even worse. And we've tried out JSF and found it's really not a mature technology - you end up relying on code in obscure tag libraries.

Wicket has an HTML page with special id attributes for every element that you want to be controllable from the Java side - for example, td wicket:id="description" - but no JSP references or any of the crap that framework designers like to stuff in the HTML page, making it unusable for anybody else. The HTML renders as HTML on its own - it's actually possible, in a real situation, to get someone to design the HTML and just plonk it into your project. I've not known this to be the case for any other framework.

For every HTML page that you want to be able to change via code, you have a Java class with the same name that sits directly alongside it. This differs from other frameworks, which have the controller somewhere, the JSP page somewhere else, the tag libraries elsewhere, the form object in another area, the form validation somewhere else and the XML controlling page flow in another place. Having the page and the code next to each other makes good sense, and it's hard to see why other frameworks have deviated from this.
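
So, for a CustomerPage.html containing the td above, the class alongside it would look something like this (CustomerPage, Customer and the db call are made-up names for illustration):

import org.apache.wicket.markup.html.WebPage;
import org.apache.wicket.markup.html.basic.Label;

public class CustomerPage extends WebPage {
    public CustomerPage() {
        // Hypothetical data layer call - fetch whatever backs this page
        Customer customer = db.findCustomer();
        // Binds to the element marked wicket:id="description" in CustomerPage.html
        add(new Label("description", customer.getDescription()));
    }
}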

Also, no XML, properties or other external files. It's a single change to web.xml to get the whole caboodle to work. So effectively, when creating a new page/application, I'm dealing with two files - the html, and the java code behind that page. That's it. Well, there's still stuff to be done in the database classes to get the page to be able to fetch the data properly, but that's common to every project and I wouldn't have it any other way - the view (in this case, the wicket page) should always be separate from the data layer.

There's no real downside that I can see at this stage. The components are extensible and reasonably easy to use. The code in the Java class is readable and easy to understand. The one possible slight niggle that I have with it is that it encourages the use of inner classes a lot. But that's not necessarily a bad thing - in fact, I'd rather have the code that executes when I click a button right by where I'm adding it to the form - the code would look something like...

add(new FeedbackPanel("feedback")); // any "error" or "info" stuff automatically appears here
Form form = new Form("monday");
Button b = new Button("submitButton") {
    @Override
    public void onSubmit() {
        // Validation checking here
        if (itsMondayMorning)
            error("Go away I'm still hungover");
        else
            db.updateSomething();
    }
};
form.add(b);
add(form);

---
Using inner classes happens all the time when doing Wicket - it does make things a bit more readable, and the code flows nicely.
I'm told by someone who's worked with Swing components that the logic is similar, despite Wicket being in a stateless HTML environment. I don't really know, having never worked with Swing, but it seems a doddle to get/set objects on forms.
And even when things get complicated - with fields being made invisible, buttons appearing and disappearing depending on user privileges, validation being complex because there's a ton of fields or multiple combinations that aren't valid - it's still easy. It doesn't suffer from the old Visual Basic problem where getting something that looked good took 5 minutes but getting something that actually worked took a lifetime and a lot of Windows API calls.

Upsides: It's easy. It's easy to do difficult stuff and still have readable code. It's easy to bolt your database layer onto. Ajax is remarkably easy to do with it. It's easily extensible (too much so?). You can write a full-scale working application with just the examples from the Wicket book, and your code won't be any more complicated than the simple examples. The forum people are wonderful at answering questions.

Possible downsides: I think it's quite heavy on the Session object, but we've had no problems with memory usage or speed so far, even with a ton of users hammering it at the same time (the bottleneck is the database). Um, oh, I don't like the way it handles custom images, but there are several better ways of doing that than the default. Errrrm, and I don't like the way the URLs look by default (but again, this can be changed).

---
So, if you're struggling with your current Java framework, go and try out Wicket. Even if it's not for you, it's probably worth a look.