CentOS 6.3 snapshot migrated from Virginia/East to Oregon or Ireland won't start

rick

25 Oct, 2012 10:23 PM

I have an AMI in the Virginia region. I copied/migrated its snapshot to Oregon and Ireland, then created a new AMI in each region from the migrated snapshot, but the instance fails to start (it just hangs on "initializing..."). One suggestion I saw was to change the root device from the default "/dev/sda1" to "/dev/sda" - no joy.

What am I missing? Shouldn't AMIs be compatible across regions?

Thanks
Rick

  1. 1 Posted by rick on 25 Oct, 2012 10:44 PM

    More data from System Log:

    Console: colour dummy device 80x25

    Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)

    Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)

    Software IO TLB disabled

    vmalloc area: e7800000-f53fe000, maxmem 2d7fe000

    Memory: 618496k/637952k available (1970k kernel code, 11016k reserved, 628k data, 156k init, 0k highmem)

    Checking if this processor honours the WP bit even in supervisor mode... Ok.

    Calibrating delay using timer specific routine.. 4537.02 BogoMIPS (lpj=22685133)

    Mount-cache hash table entries: 512

    CPU: L1 I cache: 32K, L1 D cache: 32K

    CPU: L2 cache: 256K

    CPU: L3 cache: 4096K

    Checking 'hlt' instruction... OK.

    Brought up 1 CPUs

    migration_cost=0

    Grant table initialized

    NET: Registered protocol family 16

    Brought up 1 CPUs

    xen_mem: Initialising balloon driver.

    VFS: Disk quotas dquot_6.5.1

    Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)

    Initializing Cryptographic API

    io scheduler noop registered

    io scheduler anticipatory registered (default)

    io scheduler deadline registered

    io scheduler cfq registered

    i8042.c: No controller found.

    RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize

    Xen virtual console successfully installed as tty1

    Event-channel device installed.

    netfront: Initialising virtual ethernet driver.

    mice: PS/2 mouse device common for all mice

    md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27

    md: bitmap version 4.39

    NET: Registered protocol family 2

    netfront: device eth0 has copying receive path.

    Registering block device major 8

    sda: sda1 sda2

    IP route cache hash table entries: 32768 (order: 5, 131072 bytes)

    TCP established hash table entries: 131072 (order: 8, 1048576 bytes)

    TCP bind hash table entries: 65536 (order: 7, 524288 bytes)

    TCP: Hash tables configured (established 131072 bind 65536)

    TCP reno registered

    TCP bic registered

    NET: Registered protocol family 1

    NET: Registered protocol family 17

    NET: Registered protocol family 15

    Using IPI No-Shortcut mode

    XENBUS: Device with no driver: device/console/0

    md: Autodetecting RAID arrays.

    md: autorun ...

    md: ... autorun DONE.

    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,0)

    Linux version 2.6.16-xenU (root@ip-10-204-118-8) (gcc version 4.0.2 20051125 (Red Hat 4.0.2-8)) #14 SMP Wed Nov 23 08:48:06 EST 2011

    BIOS-provided physical RAM map:

    Xen: 0000000000000000 - 0000000026f00000 (usable)

    0MB HIGHMEM available.

    623MB LOWMEM available.

    NX (Execute Disable) protection: active

    Built 1 zonelists

    Kernel command line: root=/dev/sda ro 4

    Enabling fast FPU save and restore... done.

    Enabling unmasked SIMD FPU exception support... done.

    Initializing CPU#0

    PID hash table entries: 4096 (order: 12, 65536 bytes)

    Xen reported: 2266.746 MHz processor.

    Console: colour dummy device 80x25

    Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)

    Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)

    Software IO TLB disabled

    vmalloc area: e7800000-f53fe000, maxmem 2d7fe000

    Memory: 618496k/637952k available (1970k kernel code, 11016k reserved, 628k data, 156k init, 0k highmem)

    Checking if this processor honours the WP bit even in supervisor mode... Ok.

    Calibrating delay using timer specific routine.. 4535.05 BogoMIPS (lpj=22675289)

    Mount-cache hash table entries: 512

    CPU: L1 I cache: 32K, L1 D cache: 32K

    CPU: L2 cache: 256K

    CPU: L3 cache: 4096K

    Checking 'hlt' instruction... OK.

    Brought up 1 CPUs

    migration_cost=0

    Grant table initialized

    NET: Registered protocol family 16

    Brought up 1 CPUs

    xen_mem: Initialising balloon driver.

    VFS: Disk quotas dquot_6.5.1

    Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)

    Initializing Cryptographic API

    io scheduler noop registered

    io scheduler anticipatory registered (default)

    io scheduler deadline registered

    io scheduler cfq registered

    i8042.c: No controller found.

    RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize

    Xen virtual console successfully installed as tty1

    Event-channel device installed.

    netfront: Initialising virtual ethernet driver.

    mice: PS/2 mouse device common for all mice

    md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27

    md: bitmap version 4.39

    NET: Registered protocol family 2

    netfront: device eth0 has copying receive path.

    Registering block device major 8

    sda: sda1 sda2

    IP route cache hash table entries: 32768 (order: 5, 131072 bytes)

    TCP established hash table entries: 131072 (order: 8, 1048576 bytes)

    TCP bind hash table entries: 65536 (order: 7, 524288 bytes)

    TCP: Hash tables configured (established 131072 bind 65536)

    TCP reno registered

    TCP bic registered

    NET: Registered protocol family 1

    NET: Registered protocol family 17

    NET: Registered protocol family 15

    Using IPI No-Shortcut mode

    XENBUS: Device with no driver: device/console/0

    md: Autodetecting RAID arrays.

    md: autorun ...

    md: ... autorun DONE.

    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,0)
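A note on reading the log above: the kernel command line is `root=/dev/sda`, the kernel detects two partitions (`sda: sda1 sda2`), and the panic names block device (8,0). On Linux, major 8 is the sd disk driver and each disk owns 16 minor numbers, with minor 0 being the whole disk. So the kernel appears to have tried to mount the whole disk rather than a partition, which may be why the mount fails. A minimal sketch of that decoding (the function name is made up for illustration, and it only handles major 8):

```python
import re

SD_MAJOR = 8           # Linux block major for the first 16 "sd" disks
MINORS_PER_DISK = 16   # per disk: minor 0 = whole disk, 1-15 = partitions


def decode_unknown_block(panic_line):
    """Return the sd device implied by an 'unknown-block(major,minor)' message."""
    match = re.search(r"unknown-block\((\d+),(\d+)\)", panic_line)
    if match is None:
        raise ValueError("no unknown-block(major,minor) found")
    major, minor = int(match.group(1)), int(match.group(2))
    if major != SD_MAJOR:
        raise ValueError(f"major {major} is not handled by this sketch")
    disk_index, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk_index)
    # Partition 0 means the whole disk, so no partition suffix is appended.
    return f"/dev/{name}{partition}" if partition else f"/dev/{name}"


line = "Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,0)"
print(decode_unknown_block(line))
```

For the panic line in this log the result is the whole-disk device `/dev/sda`, not `/dev/sda1`.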

  2. Support Staff 2 Posted by Ylastic on 25 Oct, 2012 11:14 PM

    That sounds like it may be an issue with the kernel being used. Is this a PVGrub AMI that uses its own custom kernel?

  3. 3 Posted by rick on 25 Oct, 2012 11:20 PM

    Yes. It was created from AMI: CentOS-6.3-x86_64-cloudinit (ami-f95cf390)

  4. Support Staff 4 Posted by Ylastic on 25 Oct, 2012 11:22 PM

    Sorry, we don't currently support migrating pvgrub AMIs. The migration dialog mentions this.

  5. 5 Posted by rick on 25 Oct, 2012 11:25 PM

    So when I use Ylastic to copy a snapshot, then create an AMI it's not going to work?

    How about for "standard" EC2 Linux kernels like Amazon Linux and Red Hat?

  6. Support Staff 6 Posted by Ylastic on 25 Oct, 2012 11:28 PM

    Create AMI needs to register the image with a specific kernel, or else it falls back to a default EC2 kernel. With a pvgrub AMI the defaults usually will not work. One workaround you can try is to see if the AMI will launch with a pvgrub kernel from the target region.

    Sorry, EC2 makes it really hard with all the combinations of kernels and regions.

    A standard EC2 kernel will usually not work for a pvgrub AMI.
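To make the workaround concrete: Amazon publishes PV-GRUB kernel images per region, in an `hd0` variant for unpartitioned root volumes and an `hd00` variant for partitioned ones, per architecture. The sketch below picks a candidate from a list of kernel records. The record shape (`id`, `name`, `arch`), the function name, and the placeholder IDs are assumptions for illustration only; real kernel IDs differ per region and have to be looked up there.

```python
def pick_pvgrub_kernel(kernels, arch, partitioned):
    """Pick a PV-GRUB kernel record matching the architecture and disk layout.

    kernels: list of dicts with hypothetical keys "id", "name" and "arch".
    partitioned: True if the root volume has a partition table.
    """
    variant = "hd00" if partitioned else "hd0"
    prefix = f"pv-grub-{variant}_"
    matches = [k for k in kernels
               if k["name"].startswith(prefix) and k["arch"] == arch]
    if not matches:
        raise LookupError(f"no {variant} PV-GRUB kernel for {arch}")
    # Lexicographic comparison is enough for names that differ only in a
    # version such as 1.03 vs 1.04; it is not a general version sort.
    return max(matches, key=lambda k: k["name"])


# Placeholder records, not real kernel IDs.
candidates = [
    {"id": "aki-EXAMPLE1", "name": "pv-grub-hd0_1.03-x86_64.gz", "arch": "x86_64"},
    {"id": "aki-EXAMPLE2", "name": "pv-grub-hd00_1.03-x86_64.gz", "arch": "x86_64"},
    {"id": "aki-EXAMPLE3", "name": "pv-grub-hd00_1.03-i386.gz", "arch": "i386"},
]
print(pick_pvgrub_kernel(candidates, "x86_64", partitioned=True)["id"])
```

The volume in this thread shows partitions `sda1` and `sda2` in its boot log, so the partitioned (`hd00`) variant would be the one to try first.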

  7. 7 Posted by rick on 25 Oct, 2012 11:53 PM

    I found compatible kernels in my target regions... still unsure if this is going to work.

    It should NOT be this freakin' hard to copy and run an image across regions!

  8. Support Staff 8 Posted by Ylastic on 25 Oct, 2012 11:57 PM

    Please let me know if the compatible kernel in the target region works for you.

    We have seen all kinds of migrations in EC2, and it can sometimes be quite painful, as there are all kinds of little things EC2 does to make it just frustrating enough.

  9. 9 Posted by rick on 26 Oct, 2012 12:13 AM

    I am now able to get the AMI and an instance launched with a compatible kernel, and getting a 2/2 status.

    However, my PHP and JavaScript files are all zero length! That tells me the Ylastic snapshot data copy process didn't work as expected.

    Can someone please assist me with this? The only reason I signed up with Ylastic was to copy my AMIs across regions - and it's not working.

  10. 10 Posted by rick on 26 Oct, 2012 12:20 AM

    What's really strange is that it looks like only the files in my product file area are zero length. In fact, ALL files in that entire tree are zero length (the proper filenames are there, but they're all empty).

    Should I just re-try the snapshot migration?

    "root" owns the files in that tree. I wonder if ec2-user can't open them for reading in the original image...

  11. 11 Posted by rick on 26 Oct, 2012 12:30 AM

    I am re-copying / migrating the original snapshot and we will see how it goes...

  12. Support Staff 12 Posted by Ylastic on 26 Oct, 2012 12:38 AM

    Hmm. That sounds like some kind of a perms issue. We rsync the files, and rsync should usually not miss files.

  13. 13 Posted by rick on 26 Oct, 2012 01:20 AM

    Looks like there's some issue with the migration that I just ran - will try it again:

    Description: Migrate snapshot [snap-e06ec697]
    From: E
    To: W2
    Status: fingerprint 50:21:ff:07:70:7a:83:5c:38:08:86:8a:21:8d:05:53 does not match for "ec2-184-73-9-236.compute-1.amazonaws.com,10.210.166.214"
    Time taken: 0h 9m
    Started: Thu Oct 25 20:29:57 -0400 2012

    Description: Migrate snapshot [snap-e06ec697]
    From: E
    To: W2
    Status: Created new snapshot [snap-078a3121]
    Time taken: 0h 30m
    Started: Thu Oct 25 00:06:01 -0400 2012

  14. 14 Posted by rick on 26 Oct, 2012 02:14 AM

    I created a new AMI and snapshot from the original source instance in Virginia/East region. Then used Ylastic to copy / migrate that snapshot to Oregon/West 2 - same issue. All of the files in my product tree are zero length.

    I can probably just tar them up and move them across, but I need to know why this is happening and how to fix it, so I can trust the region-to-region copies.
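One way to build trust in a region-to-region copy is to compare a manifest of the source tree against a manifest of the copy. The following is a minimal sketch, not part of Ylastic's tooling: it assumes both trees can be read (for root-owned files, run it as root) and the function names are made up for illustration.

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to (size, sha256 hex)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            rel = os.path.relpath(path, root)
            manifest[rel] = (os.path.getsize(path), digest.hexdigest())
    return manifest


def diff_manifests(source, copy):
    """Return (missing, changed): paths absent from copy, and paths that differ."""
    missing = sorted(p for p in source if p not in copy)
    changed = sorted(p for p in source if p in copy and source[p] != copy[p])
    return missing, changed
```

In practice each manifest would be built on its own instance, saved (for example as JSON), and the two compared in one place. A file that arrived empty shows up in `changed` with a size of 0 on the copy side.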

    Any suggestions on how we can get to the bottom of this?

  15. 15 Posted by rick on 26 Oct, 2012 02:35 AM

    Tar'd the files in the one tree up and moved them across. I noticed that ec2-user did not have permission to write to the particular target directory (my SFTP failed when initially uploading the tar.gz file).

    Can you tell me what user ID rsync runs as when copying the files? My app still doesn't work on the copied/migrated snapshot, so it's clear not everything is coming across as expected.

    Any other ideas what could be causing the copy/migration to be incomplete?

  16. 16 Posted by rick on 26 Oct, 2012 03:20 AM

    How does one go about opening a Support ticket with Ylastic?

  17. Support Staff 17 Posted by Ylastic on 26 Oct, 2012 11:43 AM

    We have only forum based support currently.

    rsync runs as root. We will need to look further into why rsync would not copy all the files across for this specific tree. Are there any other special bits set on the files that are not being rsynced across?

    thanks!

  18. Support Staff 18 Posted by Ylastic on 26 Oct, 2012 12:01 PM

    Here is the rsync command we use along with parameters in there:

    sudo -E rsync -WPazSHAX --rsh='ssh -o stricthostkeychecking=no -i #{keyfile}' --rsync-path 'sudo rsync' #{source_path} #{sshuser}@#{dest_ip}:#{dest_path}
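For reference, the short options in that command are standard rsync flags: `-W` copies whole files (no delta transfer), `-P` shows progress and keeps partial files, `-a` is archive mode, `-z` compresses in transit, `-S` handles sparse files, `-H` preserves hard links, `-A` preserves ACLs and `-X` preserves extended attributes. A follow-up check could reuse the same endpoints with `--checksum --dry-run --itemize-changes`, which reports files whose contents differ without transferring anything. The sketch below only builds such an argument list. The function name is hypothetical, the parameters mirror the placeholders in the command above, and ssh host key checking is left at its default here.

```python
def build_verify_command(keyfile, source_path, sshuser, dest_ip, dest_path):
    """Build an rsync argv that lists differing files without copying them."""
    return [
        "sudo", "-E", "rsync",
        "-aHAX",              # same metadata options as the migration command
        "--checksum",         # compare contents, not just size and mtime
        "--dry-run",          # report only; change nothing on the destination
        "--itemize-changes",  # one output line per file that would be updated
        f"--rsh=ssh -i {keyfile}",
        "--rsync-path=sudo rsync",
        source_path,
        f"{sshuser}@{dest_ip}:{dest_path}",
    ]


# Example values are placeholders, not taken from the thread.
argv = build_verify_command("/path/to/key.pem", "/var/www/", "ec2-user",
                            "203.0.113.10", "/var/www/")
print(" ".join(argv))
```

The list can be passed to `subprocess.run(argv)`. If the copy matches the source, the run should list no files as needing an update.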

  19. 19 Posted by rick on 26 Oct, 2012 12:48 PM

    No special bits on the /var/www tree - it's just owned by root and mostly chmod 755.

    The files and directories are all there, but they are all zero-length (as if an error occurred when either reading or writing each one)
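A quick way to list exactly which files arrived empty is to walk the tree and collect every regular file of size zero. This is a minimal sketch. The function name is made up, and `/var/www` is only used as the example root because it is the tree mentioned in this post.

```python
import os


def find_empty_files(root):
    """Return sorted relative paths of regular files under root that are 0 bytes."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # ignore symlinks and special files
            if os.path.getsize(path) == 0:
                empty.append(os.path.relpath(path, root))
    return sorted(empty)


if __name__ == "__main__":
    for rel_path in find_empty_files("/var/www"):
        print(rel_path)
```

Running it on both the source and the migrated instance shows whether the empty files are specific to the copy.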

  20. Support Staff 20 Posted by Ylastic on 26 Oct, 2012 12:50 PM

    OK. I will look into this further and see if we can find out why rsync is having an issue with this. 755 should be just fine as far as perms go.

  21. 21 Posted by rick on 26 Oct, 2012 12:59 PM

    If you need to use my account to reproduce the issue, just let me know.
    Thank you.

  22. Support Staff 22 Posted by Ylastic on 26 Oct, 2012 01:00 PM

    Sorry, we cannot use customer accounts for anything.

    Will try to reproduce this on our end and let you know once I have more info.

    thanks

  23. 23 Posted by rick on 27 Oct, 2012 12:04 AM

    Thanks

  24. 24 Posted by Jeff on 30 Oct, 2012 06:15 PM

    Did you ever find the root cause of this? We have signed up for Ylastic to do this and are considering cancelling if this doesn't work.

  25. 25 Posted by rick on 30 Oct, 2012 06:39 PM

    Hi Jeff,

    Ylastic did not resolve this for me, so I cancelled my account. I ended up getting good support (for free) and a working solution from Cloudy Scripts.

    https://cloudyscripts.com/tool/show/13

    The MS Windows snapshot copy works for my CentOS 6.3 image (apparently because of the "partitions" contained in the image, according to the Cloudy Scripts author).

    Hope that helps. Good luck!
    Rick

  26. Support Staff 26 Posted by Ylastic on 30 Oct, 2012 06:46 PM

    Sorry, we do not support CentOS 6.3, as I mentioned earlier. Also, multi-partitioned volumes are not currently supported.

  27. 27 Posted by rick on 30 Oct, 2012 06:59 PM

    No worries. We found another solution.

    Thanks.

  28. Support Staff 28 Posted by Ylastic on 05 Nov, 2012 01:42 PM

    Update on the issue of some files being zero length. We were able to recreate it on a 64-bit Amazon Linux instance. Updating the version of rsync seems to have fixed it there: the files that previously arrived zero length now copy over correctly. If you get a chance, please try the migration again and let me know if you still have an issue with zero-length files.

    thanks

  29. 29 Posted by Rick on 05 Dec, 2012 09:48 PM

    Okay - ready to try it again... I have asked Ylastic support for a trial account to test/verify the copied snapshot image will boot.

  30. Ylastic closed this discussion on 08 Aug, 2014 02:17 PM.
