Solved

Ubuntu 13.10 installation error ??? ???

Posted on 2014-02-18
Last Modified: 2014-02-19
The end goal is to build an Ubuntu 13.10 Compute unit for CUDA 5.5 using the nVidia Tesla K40.

Hardware:

ASUS P9X79 Motherboard
64GB RAM (8 x 8GB Kingston HyperX Beast KHX24C11T3K2/16X)
Intel Core i7-4960X CPU
EVGA nVidia GT610 GPU
nVidia K40 Tesla
1 x OWC Mercury Accelsior 1TB PCI-e  SSD
2 x SanDisk Ultra Plus SSD 256GB (SDSSDHP256G)

(RAID1 attached to motherboard in IRST mode, because we have found that RST does not function at all correctly!)

All firmware has been updated, and the rig above was soak tested for 72 hours under Windows 7 with no issues (including running SETI on the K40!).

The configuration required is to use the two SanDisk SSDs in RAID1 for the OS, and use the 1TB PCI-e card (this can only fit in one slot!) for data.

BIOS set to defaults, except SATA changed to RAID1, and IRST mode, because not compatible with Ubuntu.

Install performed from USB pen and CDROM using Ubuntu 13.10 Desktop.

The PC is connected to the internet; we tried selecting Download Updates whilst installing and Install third party software.

Disks have been erased and wiped using DBAN.

After selecting Erase Disk and Install Ubuntu, the installer continues, recognizes the SATA RAID, and proceeds to the Time Zone (location) selection...

where it fails with Error ??? ???

[Image: Ubuntu1310error]
I have read that this could mean you need to create the partition tables manually, but this has worked on this machine before, although we then had another issue (described below).

So why the error?

The later issue, and the reason we are trying to re-install, is that the RAID1 configuration on the motherboard reset, resulting in the OS being lost (requiring a re-install!).

We have two identical machines, and they behave the same. However, if we remove ALL SSDs in the RAID1, the ??? ??? goes away, e.g. when installing on the 1TB PCI-e card.

This error does not occur with 12.04 LTS; the installation completes successfully, but the installed system does NOT boot!

Comments welcome, and points for a fix!

Expert Comment

by:gheist
CUDA 5.5 supports ONLY one currently supported Ubuntu version: 12.04.3 LTS (NOT 12.04.4 with kernel 3.11).
Note that 12.04.2 and 13.04 are the oldest releases that will install on a UEFI system (like your motherboard).
CentOS 6 (called RHEL 6 there) is another option still in the supported train.
Also start with the minimal version; you can use Ubuntu's tasksel to add Unity later.
Make sure you don't just soak test, but run memtest86(+) for those 3 days.
And after you install Linux, run (yes > /dev/null) & on each CPU for a day or two.
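
A rough Python equivalent of the yes > /dev/null load, for anyone who wants a burn-in that stops by itself after a fixed time. This is only a sketch: the name burn_cpus and its parameters are made up for illustration, and it assumes a Python 3 interpreter is available on the installed system.

```python
import os
import subprocess
import sys
import time


def burn_cpus(seconds, workers=None):
    """Keep `workers` cores busy for `seconds`, then stop.

    Returns the number of busy-loop processes that were started.
    """
    # Default to one busy-loop process per logical CPU reported by the OS.
    workers = workers or os.cpu_count() or 1
    # Each child is a separate interpreter spinning in a tight loop,
    # comparable to running one `yes > /dev/null` per core.
    procs = [
        subprocess.Popen([sys.executable, "-c", "while True: pass"])
        for _ in range(workers)
    ]
    try:
        time.sleep(seconds)
    finally:
        # Always stop the children, even if the sleep is interrupted.
        for proc in procs:
            proc.terminate()
        for proc in procs:
            proc.wait()
    return workers
```

Calling burn_cpus(24 * 60 * 60) would load every core for a day; watching temperatures and dmesg while it runs is left to the operator.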

Author Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
Thanks for your reply, but it does not get me near a solution at present. We've spent three man-days on this, and we are likely to start dumping the SSDs (RAID1) and the 1TB PCI-e card to get an installation of Ubuntu 13.10 (and then the fun starts with CUDA, and we may have to drop to 13.04/12.04).

Yes, we are aware of the limitation of CUDA 5.5 on 13.10, but this is the requirement, we are told at present!

(We are nowhere near this at present; that's more fun to come, because of these installation issues.)

12.04 as noted, installs, no error, but does not boot.

13.10 causes these errors as above, but when SSDs are removed, it works on the 1TB card, but this was supposed to be for storage not OS.

Expert Comment

by:gheist
12.04.2 (aka kernel 3.5) is the minimum to boot on UEFI.
See here for 12.04.4:
http://releases.ubuntu.com/precise/

You cannot run CUDA on 13.10 (kernel 3.11)
It will run on 13.04/12.04.3 (kernel 3.8)

Whatever you are told just will not work at all.
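
To see which kernel a given install is actually running before comparing it with the versions above, something like the following could be used. It is a sketch only: the 3.8 cut-off is taken from this comment, not from NVIDIA's documentation, and the function names are hypothetical.

```python
import platform
import re

# Assumption taken from the comment above, not from NVIDIA documentation:
# kernels up to and including 3.8 are treated as the supported range.
ASSUMED_MAX_SUPPORTED = (3, 8)


def kernel_major_minor(release=None):
    """Return (major, minor) from a release string like '3.11.0-15-generic'."""
    # With no argument, use the release string of the running kernel.
    release = release if release is not None else platform.release()
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def within_assumed_support(release=None):
    """True if the kernel is no newer than the assumed cut-off."""
    # Tuples compare element by element, so (3, 11) > (3, 8).
    return kernel_major_minor(release) <= ASSUMED_MAX_SUPPORTED
```

A result of False only means the kernel is newer than the assumed cut-off; it says nothing about whether the driver and toolkit will build on it in practice.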

Author Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
So 13.10 should work?

Why the ??? ??? on installation? And why do the ??? ??? disappear on removal of the SSDs?

Author Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
At present we need to install 13.10. It does not install with the system as above.

This is the requirement.

If CUDA 5.5 does not function, as seen in some threads, we will then have to advise. But that is Part 2 of the issue.

Expert Comment

by:gheist
Use the server or alternate media to install in text mode; the error looks like a graphical toolkit problem, not anything technical (and it also shows that non-LTS versions are not so well polished).

Are you really going to bin a $6000 compute card? I can give you my address to send it to for proper disposal...

Author Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
We have two K40s.

If 13.10 cannot be made to comply with CUDA 5.5, we will advise the client accordingly, but at present we cannot provide working proof without a working 13.10 system.

I suspect the install issue is storage or BIOS related.

Expert Comment

by:serialband
Have you tried breaking the RAID and just installing on a single disk first, to isolate the RAID hardware?

Author Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
Yes, it does not work; same error.

If we remove all the SanDisk SSDs and use the PCI-E 1TB card, it works okay. Vice versa also works okay, but combined it does not work.

Expert Comment

by:serialband
Is there an interrupt conflict or memory space conflict between the 2 in BIOS?

Can you just install the system with just the RAID first, configure it and make sure it boots?  If it works all the way through, then maybe you can add the  PCI-E 1 TB card after everything is installed.

Expert Comment

by:gheist
Do it the IBM way: install on a USB stick
Or the HP way: use the (micro-)SD card ;)
Why would you mirror a system boot disk that takes 10-20 min to reinstall...

Author Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
@serialband Yes, we've tried that as well. After inserting the PCI-E 1 TB card, all is well with a working system, but on the 3rd reboot later the RAID configuration on the motherboard "resets", leaving a blank system.

@gheist, there is no option on the motherboard for SD card or USB installation, and it's also part of the brief and requirement for Mirrored RAID 1 OS installation.

Expert Comment

by:gheist
I am not being a demagogue, but IBM ships the OS on USB, and HP servers have an internal SD card slot for booting the system (so as not to waste too much money on it).

Your motherboard has software (aka FAKE) RAID, so it is better to enable it in Linux. There is no difference: the RST Windows driver reads the RAID config from the BIOS, and nobody gets hurt if Linux reads it from the last sector of the disk.
https://help.ubuntu.com/community/FakeRaidHowto
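
If the array ends up assembled by the kernel md driver (an assumption; with dmraid, as in the howto above, it would show up under /dev/mapper instead), its state is listed in /proc/mdstat. A minimal Python sketch that pulls the array name, state, level and member disks out of that text; parse_mdstat is an illustrative name, not part of any tool.

```python
def parse_mdstat(text):
    """Parse /proc/mdstat-style text into {array: {state, level, members}}."""
    arrays = {}
    for line in text.splitlines():
        # Array lines look like: "md126 : active raid1 sda[1] sdb[0]"
        if not line.startswith("md") or " : " not in line:
            continue
        name, rest = line.split(" : ", 1)
        tokens = rest.split()
        state = tokens[0] if tokens else ""
        # The RAID level is only listed for arrays that have one.
        level = next((t for t in tokens[1:] if t.startswith("raid")), None)
        # Member entries carry a bracketed slot number, e.g. "sda[1]".
        members = sorted(t.split("[", 1)[0] for t in tokens if "[" in t)
        arrays[name.strip()] = {
            "state": state,
            "level": level,
            "members": members,
        }
    return arrays
```

On a running system this could be fed with open("/proc/mdstat").read(); an empty result would suggest no md array has been assembled.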

Accepted Solution

by:Andrew Hancock (VMware vExpert / EE MVE)
We've established that the drivers in Ubuntu 13.10 are not compatible with the SATA (IDE/AHCI/RAID) controller on this motherboard; neither SSDs nor conventional SATA disk drives worked correctly.

We found that even with only a single SATA drive connected to the bus, Ubuntu 13.10 would not detect any SATA device to install on.
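
A quick way to confirm from the live session which disks the kernel actually sees is to list /sys/block. A minimal sketch in Python, assuming a Linux system with sysfs mounted; list_disks is an illustrative name, and filtering out loop and RAM devices is an arbitrary choice.

```python
import os


def list_disks(sys_block="/sys/block"):
    """Return the sorted block device names the kernel has registered."""
    # On a system without sysfs (or with a wrong path) report nothing.
    if not os.path.isdir(sys_block):
        return []
    # Loop and RAM devices are not install targets, so leave them out.
    skip = ("loop", "ram")
    return sorted(
        name for name in os.listdir(sys_block) if not name.startswith(skip)
    )


if __name__ == "__main__":
    for disk in list_disks():
        print(disk)
```

If no sd* entries appear with a SATA drive attached, the kernel has not registered the drive, which points at the controller or its driver rather than at the installer's partitioning step.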

We have removed the two SSDs and installed Ubuntu 13.10 on the 1 x OWC Mercury Accelsior 1TB PCI-e SSD.

This configuration is acceptable, although slower than 2 x SanDisk Ultra Plus SSD 256GB (SDSSDHP256G).

As an aside, we have successfully:

1. Installed Ubuntu 13.10 on the OWC Mercury Accelsior 1TB PCI-e SSD.
2. Installed nVidia Drivers 331.49 for nVidia GT610 & Tesla K40c (Nvidia SMI 331.49)
3. Installed and successfully compiled CUDA 5.5.0 toolkit
4. Successfully compiled  CUDA 5.5.0 toolkit and Samples.
5. Executed various CUDA 5.5 samples against the K40c Compute card.

Here's a screenshot of running Mandelbrot, and the output from nvcc -V and nvidia-smi.

[Image: Mandelbrot, and output from nvcc -V and nvidia-smi on Ubuntu 13.10 with Tesla K40c, CUDA 5.5.0]
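
The two checks in the screenshot can also be scripted. A minimal sketch, assuming nvcc and nvidia-smi are on the PATH when installed; tool_output and cuda_report are hypothetical helper names, and a missing tool is reported as None rather than raising an error.

```python
import shutil
import subprocess


def tool_output(command):
    """Run `command` (a list of arguments) and return its combined output.

    Returns None if the executable is not found on PATH.
    """
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def cuda_report():
    """Collect the output of the two commands shown in the screenshot."""
    return {
        "nvcc": tool_output(["nvcc", "-V"]),
        "nvidia-smi": tool_output(["nvidia-smi"]),
    }


if __name__ == "__main__":
    for tool, output in cuda_report().items():
        print("== %s ==" % tool)
        print(output if output is not None else "(not found on PATH)")
```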

Author Closing Comment

by:Andrew Hancock (VMware vExpert / EE MVE)
The solution provided and accepted is the answer to this question. nVidia may state that CUDA 5.5.0 is not supported on Ubuntu 13.10, but this shows it does compile and function correctly without issue.