29 March 2019

Getting tensorflow working. Part 1.

And now I am ready to try tensorflow in ubuntu.

First, I wanted to try just on my cpu. Try and burn off that new cpu smell :)
pip install tensorflow
conda install -c conda-forge keras
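
A quick sanity check before firing up spyder is to ask Python whether it can actually see both packages in the current environment. This is only a minimal sketch using the standard library; the helper name `is_installed` is made up for illustration and is not part of tensorflow or keras.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the two packages installed above.
for name in ("tensorflow", "keras"):
    print(name, "found" if is_installed(name) else "missing")
```

If either one shows as missing, the likely culprit is that pip and conda installed into different environments.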

Start spyder, run my cnn script and... it works. It's a lot faster than on my Mac, but then it should be.

My cpu doesn't get too toasty, hits about 53C. I ran with just a few epochs, and it's noticeably faster (it should bring the run time down by about 2/3, which seems to make sense, as I'm on 6 cores rather than 2).
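
The 2/3 figure is just back-of-the-envelope core counting. As a rough sketch, assuming the training work splits perfectly across cores (it won't in practice, and clock speed and memory are ignored), the arithmetic looks like this. The function name `ideal_time_fraction` is hypothetical.

```python
def ideal_time_fraction(old_cores, new_cores):
    """Fraction of the original run time left if work scaled perfectly with cores."""
    return old_cores / new_cores


# 2 cores on the Mac, 6 cores on the new machine.
fraction = ideal_time_fraction(2, 6)
print(f"time remaining: {fraction:.2f}, time saved: {1 - fraction:.2f}")
```

So going from 2 cores to 6 leaves about a third of the original time, which is where the "down by about 2/3" estimate comes from.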

Now let's try installing tensorflow-gpu.

First off is a graphics driver update. From the standard ubuntu ones to nvidia's. That seems to go reasonably well.
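
One way to check that the nvidia driver really did take is to see whether `nvidia-smi` (the status tool that ships with nvidia's drivers) is on the PATH and runs cleanly. The following is a hedged sketch rather than anything from the tensorflow instructions; the function name `nvidia_driver_summary` is invented for the example.

```python
import shutil
import subprocess


def nvidia_driver_summary():
    """Return nvidia-smi's text output, or None if the tool is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    # A non-zero exit code usually means the driver is not loaded properly.
    return result.stdout if result.returncode == 0 else None


summary = nvidia_driver_summary()
print(summary if summary else "nvidia-smi not available; driver may not be loaded")
```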

Next, follow instructions at the tensorflow website.

For whatever reason (that was a couple of weeks ago, and I can't remember details), that didn't get me far. It was getting late on Saturday and I only had one night and one day. As a bit of relaxation I was browsing the web, and found a company that has a tf-gpu stack available. A reasonably well known company (I won't say who, as I had a bad experience, and I suspect it was me rather than them).

I downloaded and ran their docker image. It seemed to do a lot, including installing new nvidia drivers. But that's all in the docker image, right? That's to be expected, right? Right?

The update completed, and I rebooted (either it asked me to, or I just thought it would be a good idea). And here is when it started to go wrong. My machine wouldn't boot correctly; it would crap out when displaying "Started bpfilter". This appears to be quite a common problem with Ubuntu 18.10 upgrades, but I've not upgraded to 18.10, I'm sticking to 18.04.

There are quite a few workarounds for this, most of which I tried. I'd boot to safe mode, apply the fix, reboot, and get the same hang. Also - video only seemed to be showing on my intel built in graphics, nothing from the nvidia card.

I spent all day Sunday trying to resolve my issues, to no avail. Whatever I had done to my machine, it was beyond my ken to resolve.

And, to be honest, seeing as it was a fairly clean install, I maybe should have abandoned it before I did. I know it's good to be able to fix things, but the fix here would be to go back in time and tell myself not to willy-nilly install other people's tf-gpu stacks.

I abandoned all hope, and ditched it.

The only thing about their stack that really annoyed me - they put their logo on my grub loader. That's just shitty. Everything else I'm more than willing to accept was me blundering through something I didn't realise the consequences of, but messing with someone's boot loader is just unnecessary and petty, in my opinion. If they want to advertise their stack (which I guess is why you do the image thing), then chuck a message on the docker container somewhere. That seems far more reasonable, and would be perfectly acceptable. This was possibly exacerbated by the fact that grub seemed to do weird things on this machine anyway.

Wait a week (work gets in the way). At least I have Win10, which I can use for remote access to work (the primary reason I have a Win10 install on here).

Next weekend I reran the ubuntu installer and resized the partitions on my 250GB nvme drive. My complete previous install, including any parts of home on that drive, was <50GB, so I downsized that to 50GB, and created a new 200GB partition.

Install Ubuntu on to the 200GB partition, and mount the 50GB as /home. It's a less than ideal approach, but for now it'll work.

Anyway - the install went reasonably well. I had to trawl through the setup I'd done previously, most of which I'd forgotten, but I just fixed things in the order they annoyed me, or I wanted them (cpu temps and similar). As I only had one more week before a week off, I decided not to mess with tensorflow, I just concentrated on getting as much as possible working again.

Next up - Part 2
