Palacios VMM Release 1.3

PatrickB
Posts: 5
Joined: Mon Nov 21, 2011 12:11 pm

Palacios VMM Release 1.3

Post by PatrickB »

Palacios 1.3 Released


The V3VEE Project at Northwestern University, the University of New Mexico, the University of Pittsburgh, Sandia National Labs, and Oak Ridge National Lab is pleased to announce the release of Palacios 1.3, a substantially enhanced version of this open-source virtual machine monitor. Palacios provides an open substrate for virtualization research, development, use, and teaching in computer systems, computer architecture, and high performance computing.

Palacios is a virtual machine monitor (VMM) that is available for public use as a community resource. It is highly configurable and designed to be embeddable into different host operating systems, such as Linux and the Kitten lightweight kernel. Palacios is a non-paravirtualized VMM that makes extensive use of the virtualization extensions in modern Intel and AMD x86 processors. With a compact codebase (<100,000 lines for 1.3), Palacios has been designed to be easy to understand and readily configurable for different environments. It is unique in being designed to be embeddable into other OSes instead of being implemented in the context of a specific OS.

Compared to the previous release, significant new functionality has been added, including support for multicore guests, support for embedding as a Linux kernel module, the VNET/P overlay network system, simple checkpoint/restore, host devices, graphics consoles and VGA, and virtual core migration. Enhancements are present throughout the codebase. Currently, Palacios can run on commodity PC hardware and on Cray XT3/4 machines such as Red Storm.

Palacios is BSD-licensed and available from http://v3vee.org. Detailed instructions on how to download, install, build, and use Palacios are available at http://v3vee.org. The site also includes links to the relevant discussion groups. Community enhancements to Palacios are very much welcomed. The V3VEE Project is supported by the United States National Science Foundation and the Department of Energy.

--The V3VEE Team
Nable
Member
Posts: 453
Joined: Tue Nov 08, 2011 11:35 am

Re: Palacios VMM Release 1.3

Post by Nable »

Does it work well on VMX PCs? I spent a lot of time getting the previous version from the repo working on my i5.
PatrickB
Posts: 5
Joined: Mon Nov 21, 2011 12:11 pm

Re: Palacios VMM Release 1.3

Post by PatrickB »

It's research code, so all of the normal caveats apply, but yes, it does work well on VMX PCs. We spent much of the last few weeks squashing a few persistent VMX bugs, particularly for multicore guests. If you are having trouble getting things working, please let us know.
Nable
Member
Posts: 453
Joined: Tue Nov 08, 2011 11:35 am

Re: Palacios VMM Release 1.3

Post by Nable »

On my side, I fixed some VMX-related bugs back in September (and committed the fixes to the /ispras branch in October), especially bugs related to UG (unrestricted guest) mode.
Our team leader sent patches to your team but said that no answer came.
Do you know anything about that, and if so, are you going to use our patches?

Also, I'm now trying to run Palacios on an HP ProLiant server (ProLiant BL2x220c G7), but linux-2.6.32-71.29.1.el6.x86_64 (CentOS 6.0) fails.
Upd: shame on me, the kernel failed to attach the timer just because I misconfigured something before compilation.
Now it hangs while the daemons start; the last message from Palacios was: palacios/src/palacios/vmm_paging.c(659): 1 Gigabyte pages not supported


Upd2: hm, it looks like you have gone your own way and made a lot of progress.
I've found what's wrong in my copy of the Palacios code, although it may already be fixed in 1.3.
Oh, it looks like merging will be a long process full of headaches.


Upd3: much more interesting info next Tuesday, when our team leader and I will be back in the city.

Patch for 2.6.32 ( http://mirror.centos.org/centos/6/updat ... l6.src.rpm ) to run under Palacios (some unrelated parts were changed because the compiler didn't like those lines):


diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 81fb1b0..7b73747 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -232,6 +232,10 @@ source "init/Kconfig"
 source "kernel/Kconfig.freezer"
 
 menu "Processor type and features"
+config PALACIOS
+	bool "Palacios support"
+	help
+		No help.
 
 source "kernel/time/Kconfig"
 
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 631958a..42a0969 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -99,6 +99,7 @@ obj-$(CONFIG_KVM_CLOCK)		+= kvmclock.o
 obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
 obj-$(CONFIG_PARAVIRT_CLOCK)	+= pvclock.o
+obj-$(CONFIG_PALACIOS)		+= palacios.o
 
 obj-$(CONFIG_PCSPKR_PLATFORM)	+= pcspeaker.o
 
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 5acdbc7..f46b207 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1303,7 +1303,7 @@ ENTRY(xen_do_hypervisor_callback)   # do_hypervisor_callback(struct *pt_regs)
 	decl PER_CPU_VAR(irq_count)
 	jmp  error_exit
 	CFI_ENDPROC
-END(do_hypervisor_callback)
+END(xen_do_hypervisor_callback)
 
 /*
  * Hypervisor uses this for application faults while it executes.
diff --git a/arch/x86/kernel/palacios.c b/arch/x86/kernel/palacios.c
new file mode 100644
index 0000000..b4f5cbf
--- /dev/null
+++ b/arch/x86/kernel/palacios.c
@@ -0,0 +1,132 @@
+/*
+ * palacios.c
+ *
+ *  Created on: Jul 12, 2011
+ *      Author: vedun
+ */
+
+
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/module.h>
+#include <linux/gfp.h>
+#include <asm/msr.h>
+#include <palacios/palacios.h>
+#include <asm/io.h>
+
+#define NO_SYM_HV 0
+#define SYM_HV_VMX 1
+#define SYM_HV_SVM 2
+#define SYM_PAGE_SIZE 12
+#define SYM_CPUID_NUM 0x90000000
+#define MEM_OFFSET_HCALL 0x1000
+
+#define SYM_MSR_GLOBAL 0x00000534
+
+static struct v3_symspy_global_page {
+    uint64_t magic;
+
+    union {
+	uint32_t feature_flags;
+	struct {
+	    uint8_t pci_map_valid      : 1;
+	    uint8_t symmod_enabled     : 1;
+	    uint8_t sec_symmod_enabled : 1;
+	} __attribute__((packed));
+    } __attribute__((packed));
+
+    uint8_t pci_pt_map[(4 * 256) / 8]; // we're hardcoding this: (4 busses, 256 max devs)
+
+} __attribute__((packed)) *symspy_global_page;
+
+
+static int symspy_is_initialized = 0;
+static int vm_is_detected = 0;
+static unsigned long long mem_offset = 0;
+
+static int detect_sym_hv(void) {
+    unsigned int eax = 0, ebx = 0;
+    printk("Detecting symbiotic hypervisor..\n");
+
+    asm volatile(
+        "cpuid;" :"=a"(eax),"=b"(ebx):"a"((unsigned int)SYM_CPUID_NUM)
+    );
+
+    if(eax == *(unsigned int*)"V3V") {
+        printk("V3VEE detected: arch %s.\n", (char*)&ebx);
+
+        if(ebx == *(unsigned int*)"SVM")
+            return SYM_HV_SVM;
+        else if(ebx == *(unsigned int*)"VMX")
+            return SYM_HV_VMX;
+        else {
+            printk("Bad signature!\n");
+            return NO_SYM_HV;
+        }
+    }
+
+    printk("V3VEE not detected. EAX %x EBX %x\n", eax, ebx);
+
+    return NO_SYM_HV;
+}
+
+static int symbiotic_test(void) {
+    int detect = 0;
+    void* vaddr;
+    dma_addr_t paddr;
+
+    printk("SYMBIOTIC TEST START\n");
+    if((detect = detect_sym_hv()) != NO_SYM_HV) {
+        int status = 0;
+        if(detect == SYM_HV_SVM) {
+            asm volatile(
+                "vmmcall;"
+                :"=a"(status), "=b"(mem_offset):"a"(MEM_OFFSET_HCALL)
+            );
+        } else {
+            asm volatile(
+                "vmcall;"
+                :"=a"(status), "=b"(mem_offset):"a"(MEM_OFFSET_HCALL)
+            );
+        }
+        if(status != 0) {
+            printk("Hypercall finished with error.\n");
+        } else {
+            printk("Detected memory offset %llx.\n", mem_offset);
+        }
+
+        vaddr = (void *) __get_free_page(GFP_KERNEL);
+        paddr = virt_to_phys(vaddr);
+        //unsigned long long value = paddr;
+        wrmsr(SYM_MSR_GLOBAL, paddr & 0xFFFFFFFF, paddr >> 32);
+        symspy_global_page = vaddr;
+
+        printk("SymspyGlobalPage detected at VA %LX, PA %LX\n", (long long)vaddr, (long long)paddr);
+
+        return 1;
+    }
+    return 0;
+}
+
+
+uint64_t palacios_get_device_dma_offset(int bus, int dev, int func) {
+	if (bus >= 4)
+		return 0;
+
+	if (!symspy_is_initialized) {
+		vm_is_detected = detect_sym_hv();
+		symbiotic_test();
+		symspy_is_initialized = 1;
+	}
+
+	if (vm_is_detected) {
+	    int dev_index = (bus << 8) + (dev << 3) + func;
+	    int major = dev_index / 8;
+	    int minor = dev_index % 8;
+		return ((symspy_global_page->pci_pt_map[major] & (1 << minor))  == 0) ? 0 : mem_offset;
+	} else {
+		return 0;
+	}
+}
+
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 6ac3931..efbc143 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -169,7 +169,10 @@ again:
 		return NULL;
 	}
 
-	*dma_addr = addr;
+	*dma_addr = addr + dev->dma_offset;
+	if (dev->dma_offset != 0) {
+		printk("Alloc %lX %lX : %lX\n", (long unsigned)*dma_addr , (long unsigned)(*dma_addr - dev->dma_offset), (long unsigned)page);
+	}
 	return page_address(page);
 }
 
diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
index a3933d4..b738fef 100644
--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -30,7 +30,8 @@ static dma_addr_t nommu_map_page(struct device *dev, struct page *page,
 				 enum dma_data_direction dir,
 				 struct dma_attrs *attrs)
 {
-	dma_addr_t bus = page_to_phys(page) + offset;
+	dma_addr_t bus = page_to_phys(page) + offset + dev->dma_offset;
+	printk("map_page: %lX %lX : %lX\n", (long unsigned)bus, (long unsigned)(bus - dev->dma_offset), (long unsigned)page);
 	WARN_ON(size == 0);
 	if (!check_addr("map_single", dev, bus, size))
 		return bad_dma_address;
@@ -64,7 +65,7 @@ static int nommu_map_sg(struct device *hwdev, struct scatterlist *sg,
 
 	for_each_sg(sg, s, nents, i) {
 		BUG_ON(!sg_page(s));
-		s->dma_address = sg_phys(s);
+		s->dma_address = sg_phys(s) + hwdev->dma_offset;
 		if (!check_addr("map_sg", hwdev, s->dma_address, s->length))
 			return 0;
 		s->dma_length = s->length;
diff --git a/arch/x86/kernel/tlb_uv.c b/arch/x86/kernel/tlb_uv.c
index c34dca8..08e5e33 100644
--- a/arch/x86/kernel/tlb_uv.c
+++ b/arch/x86/kernel/tlb_uv.c
@@ -1491,7 +1491,7 @@ static void uv_init_per_cpu(int nuvhubs)
 	int uvhub;
 	short socket = 0;
 	unsigned short socket_mask;
-	unsigned int uvhub_mask;
+	unsigned int uvhub_mask = 0;
 	struct bau_control *bcp;
 	struct uvhub_desc *bdp;
 	struct socket_desc *sdp;
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 35236aa..9b63cba 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -3225,7 +3225,7 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	VPRINTK("ENTER\n");
 
-	WARN_ON(ATA_MAX_QUEUE > AHCI_MAX_CMDS);
+	//WARN_ON(ATA_MAX_QUEUE > AHCI_MAX_CMDS);
 
 	if (!printed_version++)
 		dev_printk(KERN_DEBUG, &pdev->dev, "version " DRV_VERSION "\n");
diff --git a/drivers/base/core.c b/drivers/base/core.c
index fab9f76..f4c9e60 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -557,6 +557,7 @@ static void klist_children_put(struct klist_node *n)
  */
 void device_initialize(struct device *dev)
 {
+	dev->dma_offset = 0;
 	dev->kobj.kset = devices_kset;
 	kobject_init(&dev->kobj, &device_ktype);
 	INIT_LIST_HEAD(&dev->dma_pools);
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 1eef267..a6044ed 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -473,7 +473,7 @@ static ssize_t dev_show_unique_id(struct device *dev,
 {
 	drive_info_struct *drv = to_drv(dev);
 	struct ctlr_info *h = to_hba(drv->dev.parent);
-	__u8 sn[16];
+	__u8 sn[16] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
 	unsigned long flags;
 	int ret = 0;
 
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index cef28a7..b2046e4 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -17,6 +17,10 @@
 
 #include "pci.h"
 
+#ifdef CONFIG_PALACIOS
+#include <palacios/palacios.h>
+#endif
+
 /**
  * pci_bus_alloc_resource - allocate a resource from a parent bus
  * @bus: PCI bus
@@ -92,6 +96,13 @@ int pci_bus_add_device(struct pci_dev *dev)
 	dev->is_added = 1;
 	pci_proc_attach_device(dev);
 	pci_create_sysfs_dev_files(dev);
+
+#ifdef CONFIG_PALACIOS
+	dev->dev.dma_offset = palacios_get_device_dma_offset
+		(dev->bus->number, dev->devfn >> 3, dev->devfn & 0x7);
+	printk("Palacios for %d:%d : DMA Offset is %lX\n", dev->bus->number, dev->devfn, (unsigned long)dev->dev.dma_offset);
+#endif
+
 	return 0;
 }
 
diff --git a/fs/compat.c b/fs/compat.c
index dc7853a..c8fb2f3 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -15,6 +15,7 @@
  *  published by the Free Software Foundation.
  */
 
+#include <linux/stddef.h>
 #include <linux/kernel.h>
 #include <linux/linkage.h>
 #include <linux/compat.h>
@@ -817,8 +818,6 @@ asmlinkage long compat_sys_mount(char __user * dev_name, char __user * dir_name,
 	return retval;
 }
 
-#define NAME_OFFSET(de) ((int) ((de)->d_name - (char __user *) (de)))
-
 struct compat_old_linux_dirent {
 	compat_ulong_t	d_ino;
 	compat_ulong_t	d_offset;
@@ -907,7 +906,7 @@ static int compat_filldir(void *__buf, const char *name, int namlen,
 	struct compat_linux_dirent __user * dirent;
 	struct compat_getdents_callback *buf = __buf;
 	compat_ulong_t d_ino;
-	int reclen = ALIGN(NAME_OFFSET(dirent) + namlen + 2, sizeof(compat_long_t));
+	int reclen = ALIGN(offsetof(struct compat_linux_dirent, d_name) + namlen + 2, sizeof(compat_long_t));
 
 	buf->error = -EINVAL;	/* only used if we fail.. */
 	if (reclen > buf->count)
@@ -994,7 +993,7 @@ static int compat_filldir64(void * __buf, const char * name, int namlen, loff_t
 {
 	struct linux_dirent64 __user *dirent;
 	struct compat_getdents_callback64 *buf = __buf;
-	int jj = NAME_OFFSET(dirent);
+	int jj = offsetof(struct linux_dirent64, d_name);
 	int reclen = ALIGN(jj + namlen + 1, sizeof(u64));
 	u64 off;
 
diff --git a/fs/readdir.c b/fs/readdir.c
index 7723401..ab07ead 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -4,6 +4,7 @@
  *  Copyright (C) 1995  Linus Torvalds
  */
 
+#include <linux/stddef.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/time.h>
@@ -54,7 +55,6 @@ EXPORT_SYMBOL(vfs_readdir);
  * anyway. Thus the special "fillonedir()" function for that
  * case (the low-level handlers don't need to care about this).
  */
-#define NAME_OFFSET(de) ((int) ((de)->d_name - (char __user *) (de)))
 
 #ifdef __ARCH_WANT_OLD_READDIR
 
@@ -152,7 +152,7 @@ static int filldir(void * __buf, const char * name, int namlen, loff_t offset,
 	struct linux_dirent __user * dirent;
 	struct getdents_callback * buf = (struct getdents_callback *) __buf;
 	unsigned long d_ino;
-	int reclen = ALIGN(NAME_OFFSET(dirent) + namlen + 2, sizeof(long));
+	int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2, sizeof(long));
 
 	buf->error = -EINVAL;	/* only used if we fail.. */
 	if (reclen > buf->count)
@@ -237,7 +237,7 @@ static int filldir64(void * __buf, const char * name, int namlen, loff_t offset,
 {
 	struct linux_dirent64 __user *dirent;
 	struct getdents_callback64 * buf = (struct getdents_callback64 *) __buf;
-	int reclen = ALIGN(NAME_OFFSET(dirent) + namlen + 1, sizeof(u64));
+	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1, sizeof(u64));
 
 	buf->error = -EINVAL;	/* only used if we fail.. */
 	if (reclen > buf->count)
diff --git a/include/Kbuild b/include/Kbuild
index 8d226bf..f1da0d9 100644
--- a/include/Kbuild
+++ b/include/Kbuild
@@ -10,3 +10,4 @@ header-y += video/
 header-y += drm/
 header-y += xen/
 header-y += scsi/
+header-y += palacios/
diff --git a/include/linux/device.h b/include/linux/device.h
index 2ea3e49..ad6273d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -407,6 +407,7 @@ struct device {
 					     allocations such descriptors. */
 
 	struct device_dma_parameters *dma_parms;
+	u64		dma_offset;	/* Palacios dma offset */
 
 	struct list_head	dma_pools;	/* dma pools (if dma'ble) */
 
diff --git a/include/palacios/Kbuild b/include/palacios/Kbuild
new file mode 100644
index 0000000..4bed9d9
--- /dev/null
+++ b/include/palacios/Kbuild
@@ -0,0 +1 @@
+header-y += palacios.h
\ No newline at end of file
diff --git a/include/palacios/palacios.h b/include/palacios/palacios.h
new file mode 100644
index 0000000..b7077a1
--- /dev/null
+++ b/include/palacios/palacios.h
@@ -0,0 +1,13 @@
+/*
+ * palacios.h
+ *
+ *  Created on: Jul 12, 2011
+ *      Author: vedun
+ */
+
+#ifndef __PALACIOS_H
+#define __PALACIOS_H
+
+uint64_t palacios_get_device_dma_offset(int bus, int dev, int func);
+
+#endif /* __PALACIOS_H */
diff --git a/init/main.c b/init/main.c
index 1a9af60..8c8bc23 100644
--- a/init/main.c
+++ b/init/main.c
@@ -744,11 +744,10 @@ int do_one_initcall(initcall_t fn)
 		calltime = ktime_get();
 		trace_boot_call(&call, fn);
 		enable_boot_trace();
-	}
+	
 
-	ret.result = fn();
+		ret.result = fn();
 
-	if (initcall_debug) {
 		disable_boot_trace();
 		rettime = ktime_get();
 		delta = ktime_sub(rettime, calltime);
@@ -756,6 +755,8 @@ int do_one_initcall(initcall_t fn)
 		trace_boot_ret(&ret, fn);
 		printk("initcall %pF returned %d after %Ld usecs\n", fn,
 			ret.result, ret.duration);
+	} else {
+		ret.result = fn();
 	}
 
 	msgbuf[0] = 0;
diff --git a/kernel/async.c b/kernel/async.c
index 27235f5..393e033 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -284,17 +284,17 @@ void async_synchronize_cookie_domain(async_cookie_t cookie,
 	if (initcall_debug && system_state == SYSTEM_BOOTING) {
 		printk("async_waiting @ %i\n", task_pid_nr(current));
 		starttime = ktime_get();
-	}
 
-	wait_event(async_done, lowest_in_progress(running) >= cookie);
+		wait_event(async_done, lowest_in_progress(running) >= cookie);
 
-	if (initcall_debug && system_state == SYSTEM_BOOTING) {
 		endtime = ktime_get();
 		delta = ktime_sub(endtime, starttime);
 
 		printk("async_continuing @ %i after %lli usec\n",
 			task_pid_nr(current),
 			(long long)ktime_to_ns(delta) >> 10);
+	} else {
+		wait_event(async_done, lowest_in_progress(running) >= cookie);
 	}
 }
 EXPORT_SYMBOL_GPL(async_synchronize_cookie_domain);
diff --git a/lib/iomap.c b/lib/iomap.c
index d322293..bd32c25 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -258,20 +258,27 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
 	resource_size_t start = pci_resource_start(dev, bar);
 	resource_size_t len = pci_resource_len(dev, bar);
 	unsigned long flags = pci_resource_flags(dev, bar);
+	void __iomem *ret = NULL;
 
 	if (!len || !start)
 		return NULL;
 	if (maxlen && len > maxlen)
 		len = maxlen;
-	if (flags & IORESOURCE_IO)
-		return ioport_map(start, len);
+	if (flags & IORESOURCE_IO) {
+		ret = ioport_map(start, len);
+		goto end;
+	}
 	if (flags & IORESOURCE_MEM) {
-		if (flags & IORESOURCE_CACHEABLE)
-			return ioremap(start, len);
-		return ioremap_nocache(start, len);
+		if (flags & IORESOURCE_CACHEABLE) {
+			ret = ioremap(start, len);
+			goto end;
+		}
+		ret = ioremap_nocache(start, len);
 	}
+end:
+	printk("DEBUG : Mapping %lX..%lX to %lX\n", (unsigned long)start, (unsigned long)len, (unsigned long)ret);
 	/* What? */
-	return NULL;
+	return ret;
 }
 
 void pci_iounmap(struct pci_dev *dev, void __iomem * addr)
As I find more information, I'll update this post.
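For anyone skimming the patch: the core of palacios_get_device_dma_offset is a one-bit-per-function lookup in pci_pt_map. Below is a minimal user-space sketch of that indexing, assuming the same layout as in the patch (4 buses, 256 devfn slots per bus). The helper names pt_map_index, pt_map_set, pt_map_test and dma_offset_for are hypothetical and exist only for illustration; they are not part of Palacios or of the patch.

```c
#include <stdint.h>

#define PT_MAP_BUSES 4
#define PT_MAP_BYTES ((PT_MAP_BUSES * 256) / 8)  /* 128 bytes, one bit per bus:dev.func */

/* Flatten bus/dev/func the way the patch does: 8 bits of devfn per bus. */
int pt_map_index(int bus, int dev, int func)
{
    return (bus << 8) + (dev << 3) + func;
}

/* Range check; the patch itself only checks the bus number. */
static int pt_map_in_range(int bus, int dev, int func)
{
    return bus >= 0 && bus < PT_MAP_BUSES &&
           dev >= 0 && dev < 32 &&
           func >= 0 && func < 8;
}

/* Hypothetical helper: mark a function as passed through (VMM side). */
void pt_map_set(uint8_t *map, int bus, int dev, int func)
{
    int idx;

    if (!pt_map_in_range(bus, dev, func))
        return;
    idx = pt_map_index(bus, dev, func);
    map[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Hypothetical helper: 1 if the bit is set, 0 if clear or out of range. */
int pt_map_test(const uint8_t *map, int bus, int dev, int func)
{
    int idx;

    if (!pt_map_in_range(bus, dev, func))
        return 0;
    idx = pt_map_index(bus, dev, func);
    return (map[idx / 8] >> (idx % 8)) & 1;
}

/* Passed-through devices get the guest memory offset added; others get 0. */
uint64_t dma_offset_for(const uint8_t *map, uint64_t mem_offset,
                        int bus, int dev, int func)
{
    return pt_map_test(map, bus, dev, func) ? mem_offset : 0;
}
```

For example, device 01:02.3 maps to index 275, i.e. bit 3 of byte 34 in the map.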
PatrickB
Posts: 5
Joined: Mon Nov 21, 2011 12:11 pm

Re: Palacios VMM Release 1.3

Post by PatrickB »

Thanks for the pointers. I only check in here every few days, so this probably isn't the best place for active development discussions. I'd suggest moving this discussion to the Palacios mailing list so that more people on our team (and others) can actively follow it. We're also trying to get more and better collaborative development tools up (bugzilla, etc.), so talking more about how to do that on [email protected] would probably be a good idea.
Nable
Member
Posts: 453
Joined: Tue Nov 08, 2011 11:35 am

Re: Palacios VMM Release 1.3

Post by Nable »

Just some words about things that we have and your release doesn't have:
1) telemetry is still not correctly handled: it uses AMD variant of exitcode2str (yes, this is not a function name but description) for all architectures and this is not correct at all. I have a nice short patch, it also replaces the definition of AMD exit codes from `#define' to `enum' style (it's much better, and i'm wondering why for intel you have from the very beginning more correct variant and for amd these tons of defines).
2) you don't have MSI-X support, so i couldn't test your release last Wednesday. MSI support is very important nowadays, especially for HPC (high performance computing) tasks.
3) some fixes for small problems that you find after attempt to compile with debugging support.
P.S> The more i'm studying vmx code of palacios, the more dominates the idea of porting kvm to KittenOS or kvm_intel (and may be kvm_amd) to palacios. Some parts of the code really needs to be rewritten from scratch to become understandable (and also to support fully the functions they must do (when i'm talking about such facts i especially remember function that handles mov to CR0, this piece of code that don't support many kinds of transitions and is more unreadable than variant from kvm) ) or to be copied from the place where they are good even now. I don't want to offend anybody, it's just about the ideas of further development.
Oh, don't care about my lyrical digressions.

OK, i'm going to try sending patches and other info to your mailing list ( [email protected] is the list address, am i right? )
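To make point 1 concrete, here is a minimal sketch of enum-style exit codes with one string table per architecture. The identifiers (svm_exit_code, vmx_exit_reason, exit_code_to_str and so on) are made up for this post and are not the names used in Palacios or in the patch; only the numeric values are the architectural ones, and only a handful of codes are listed.

```c
#include <stdint.h>

/* A few AMD SVM exit codes (hypothetical names, architectural values). */
enum svm_exit_code {
    SVM_EXIT_CPUID    = 0x72,
    SVM_EXIT_HLT      = 0x78,
    SVM_EXIT_SHUTDOWN = 0x7f,
    SVM_EXIT_VMMCALL  = 0x81,
    SVM_EXIT_NPF      = 0x400
};

/* A few Intel VMX basic exit reasons (hypothetical names, architectural values). */
enum vmx_exit_reason {
    VMX_EXIT_TRIPLE_FAULT  = 2,
    VMX_EXIT_CPUID         = 10,
    VMX_EXIT_HLT           = 12,
    VMX_EXIT_VMCALL        = 18,
    VMX_EXIT_EPT_VIOLATION = 48
};

enum cpu_arch { ARCH_SVM, ARCH_VMX };

const char *svm_exit_to_str(uint64_t code)
{
    switch (code) {
    case SVM_EXIT_CPUID:    return "CPUID";
    case SVM_EXIT_HLT:      return "HLT";
    case SVM_EXIT_SHUTDOWN: return "SHUTDOWN";
    case SVM_EXIT_VMMCALL:  return "VMMCALL";
    case SVM_EXIT_NPF:      return "NPF";
    default:                return "UNKNOWN_SVM_EXIT";
    }
}

const char *vmx_exit_to_str(uint32_t reason)
{
    /* The basic exit reason lives in bits 15:0 of the exit-reason field. */
    switch (reason & 0xffffu) {
    case VMX_EXIT_TRIPLE_FAULT:  return "TRIPLE_FAULT";
    case VMX_EXIT_CPUID:         return "CPUID";
    case VMX_EXIT_HLT:           return "HLT";
    case VMX_EXIT_VMCALL:        return "VMCALL";
    case VMX_EXIT_EPT_VIOLATION: return "EPT_VIOLATION";
    default:                     return "UNKNOWN_VMX_EXIT";
    }
}

/* Pick the table by architecture instead of using the SVM one everywhere. */
const char *exit_code_to_str(enum cpu_arch arch, uint64_t code)
{
    return (arch == ARCH_SVM) ? svm_exit_to_str(code)
                              : vmx_exit_to_str((uint32_t)code);
}
```

The same number means different things on the two architectures (0x72 is CPUID on SVM but is not a VMX reason in this table), which is why a single shared table produces misleading telemetry.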
PatrickB
Posts: 5
Joined: Mon Nov 21, 2011 12:11 pm

Re: Palacios VMM Release 1.3

Post by PatrickB »

Nable wrote:Just a few words about things that we have and your release doesn't:
1) Telemetry is still not handled correctly: it uses the AMD variant of exitcode2str (that is a description, not an actual function name) for all architectures, which is simply wrong. I have a nice short patch; it also converts the AMD exit-code definitions from `#define' to `enum' style (which is much better, and I wonder why the Intel code had the more correct variant from the very beginning while the AMD code has these tons of defines).
We're aware of that problem, and that would be a useful patch, though we're also looking into replacing some of the telemetry functionality with a more general profiling interface. As for the difference between the Intel and AMD code: the AMD code is older and the Intel code is newer, and we haven't gone back to clean up the AMD code in places due to limited manpower.
2) You don't have MSI-X support, so I couldn't test your release last Wednesday. MSI support is very important nowadays, especially for HPC (high-performance computing) tasks.
I agree, but supporting MSI-X properly, including for virtual devices, APICs, and such, is quite complicated; we'd be happy to discuss the right way to integrate your work on MSI-X on the development mailing list.
3) Some fixes for small problems that show up when you try to compile with debugging support.
P.S. The more I study the VMX code in Palacios, the more the idea of porting KVM to KittenOS, or kvm_intel (and maybe kvm_amd) to Palacios, takes hold.
We're BSD-licensed, and pulling in KVM code (which is GPLed) directly would compromise that.
Some parts of the code really need to be rewritten from scratch to become understandable and to fully support what they are supposed to do, or else copied from a place where they already work well. The example I have in mind is the function that handles mov to CR0: it doesn't support many kinds of transitions and is less readable than the KVM version. I don't want to offend anybody; these are just ideas for further development.

OK, I'm going to try sending patches and other info to your mailing list ( [email protected] is the list address, am I right? )
That's the right address, though you'll need to join the mailing list through our website to be able to post to it. Simple patches such as code cleanup, new virtual devices, or well-contained enhancements you should feel free to send along and discuss integrating with us. For more complex patches (like full MSI-X support or changes to CR0 handling), you'll want to talk with us before you do substantial work or submit major patches. Such code impacts a lot of Palacios (e.g. MSI-X support relates closely to the APIC/IO-APIC virtual devices), and if we haven't planned for your changes, they're more likely to conflict with other changes we've made and not be accepted. If you discuss with us what you're doing in a particular direction, however, we'd be happy to plan out details so that it is easy to integrate your improvements into our codebase.
Nable
Member
Posts: 453
Joined: Tue Nov 08, 2011 11:35 am

Re: Palacios VMM Release 1.3

Post by Nable »

http://v3vee.org/ -> http://www.v3vee.org/palacios/ -> Open Discussion Group for Developers (http://groups.google.com/group/v3vee-development) -> http://groups.google.com/group/v3vee-de ... pics?gvc=2
The last message is dated "Jun 29". Is this group really alive?

P.S. I tested palacios-1.3 today on my i5. Beforehand I read your unrestricted guest code, recalled everything I had read in the documentation, and the only thought in my head was "It cannot work, and I even know when it will fail". And indeed it failed just as I expected, with VMEXIT_TRIPLE_FAULT. I know why it fails, but I'll write more only once I get my version working on the ProLiant. It's a pity that this is still not fixed in 1.3: the functions v3_drill_guest_pt_64 and pdpe64_lookup don't support 1 GB pages, so implementing that is my current goal.
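For context on the 1 GB page issue, here is a minimal user-space sketch of what a long-mode page table walk has to do at the PDPT level: if the PS bit (bit 7) of a present PDPT entry is set, the entry maps a 1 GB page directly and the walk stops there. This is not the Palacios code; the names classify_pdpte and translate_1gb are hypothetical, and only the bit layout is architectural.

```c
#include <stdint.h>

#define PTE_PRESENT      (1ULL << 0)
#define PTE_PAGE_SIZE    (1ULL << 7)            /* PS bit: entry maps a large page */
#define PDPTE_1GB_BASE   0x000FFFFFC0000000ULL  /* bits 51:30, base of a 1 GB page */
#define PDPTE_TABLE_BASE 0x000FFFFFFFFFF000ULL  /* bits 51:12, next-level table    */
#define OFFSET_1GB_MASK  ((1ULL << 30) - 1)     /* low 30 bits of the address      */

enum pdpte_kind {
    PDPTE_NOT_PRESENT,
    PDPTE_POINTS_TO_PD,  /* normal case: continue the walk in a page directory */
    PDPTE_LARGE_1GB      /* PS set: translation ends here */
};

enum pdpte_kind classify_pdpte(uint64_t pdpte)
{
    if (!(pdpte & PTE_PRESENT))
        return PDPTE_NOT_PRESENT;
    return (pdpte & PTE_PAGE_SIZE) ? PDPTE_LARGE_1GB : PDPTE_POINTS_TO_PD;
}

/*
 * If the entry maps a 1 GB page, store the physical address for vaddr in
 * *paddr and return 0; otherwise return -1 so the caller keeps walking.
 */
int translate_1gb(uint64_t pdpte, uint64_t vaddr, uint64_t *paddr)
{
    if (classify_pdpte(pdpte) != PDPTE_LARGE_1GB)
        return -1;
    *paddr = (pdpte & PDPTE_1GB_BASE) | (vaddr & OFFSET_1GB_MASK);
    return 0;
}
```

A walker that treats every present PDPT entry as a pointer to a page directory will misread a 1 GB mapping, which is why a guest kernel that uses such pages trips over it.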
PatrickB
Posts: 5
Joined: Mon Nov 21, 2011 12:11 pm

Re: Palacios VMM Release 1.3

Post by PatrickB »

Nable wrote:http://v3vee.org/ -> http://www.v3vee.org/palacios/ -> Open Discussion Group for Developers (http://groups.google.com/group/v3vee-development) -> http://groups.google.com/group/v3vee-de ... pics?gvc=2
The last message is dated "Jun 29". Is this group really alive?
Yes, almost all of the developers are on it, but we also have a separate internal developers list and conference calls for the core team, which is where most of the traffic leading up to the release has been. We'd like to get more people involved from the general community and the open discussion list, however, and that list is the appropriate place for these discussions.