lis-test's Introduction

LIS-Test-Automation

Automation platform for testing Linux images on Microsoft Hyper-V

Overview

The LIS-Test repository contains the test infrastructure for testing the Linux Integrated Services (LIS) drivers on Linux virtual machines running under Hyper-V.

The LIS-Test repository includes the test scripts to verify correct behavior of the LIS drivers, as well as the PowerShell scripts that run the test scripts.

Prerequisite

  1. You must have a Windows machine with PowerShell. Accepted platforms:

    •	Windows 7 x64
    •	Windows 8 x64
    •	Windows 10 x64
    •	Server 2008 R2
    •	Server 2012
    •	Server 2012 R2
    •	Server 2016

Download Latest Automation Code

  1. Check out the code from https://github.com/LIS/lis-test.git

Download Public Utilities

Download the PuTTY executables from http://www.putty.org and place them in the .\bin folder. You should have the following utilities:

    •	plink.exe
    •	pscp.exe
    •	putty.exe
    •	puttygen.exe

Additional details can be found in the Quick Start Guide and the LIS Users Guide; please visit https://github.com/LIS/lis-test/tree/master/WS2012R2/lisa/docs

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

lis-test's Issues

Disabling DM ( pre-setup ) does not work for numa cases

I ran into a problem recently. My ICABase checkpoint had DM (Dynamic Memory) enabled. When running NUMA cases (e.g. CheckNuma), DM_DISABLE.ps1 is invoked, but it did not actually take effect, so the case failed.

I did a little debugging. It looks like the problem lies in DM_CONFIGURE_MEMORY.ps1, line 233.
No memory configuration is done unless all four parameters are present (vmName, tpEnabled, tPstartupMem, tPmemWeight).
In the CheckNuma case, however, the tPstartMemory param is not set in the xml files. The memory configuration part of the script is skipped entirely (without leaving any warning message).

I tried to change the code, but it looks tricky to me:

One approach is to simply add the startupMem param to all cases that invoke DM_DISABLE.ps1. I do not think this is a good way to address the problem, since you would have to hard-code the memory size in the xml files for all cases that require DM to be disabled (or even more cases). It is also not really reasonable to make startupMem a required param for a user who just wants to disable DM.

Another approach is to add an exception to the " if " in line 233 (an " elseif "). This is not really safe, since the script runs the whole loop when parsing each param, so some unexpected behavior might occur.

SR-IOV test cases automation

The Single Root I/O Virtualization (SR-IOV) specification is a standard for a type of PCI device assignment that can share a single device with multiple virtual machines. SR-IOV improves device performance for virtual machines.

SR-IOV support is present in the upstream kernel and has been ported to newer Linux distributions.

putty md5sum location

InstallPutty() checks for $filesum.Hash -ne $sums[ "x86/${util}" ], but the actual path in sha256sums.txt is w32, at least for me.

Could you please check FC_MultipathDetection.ps1 #301 is correct test step by Get-Disk?

Hi Chris,

It looks like line #301 of https://github.com/LIS/lis-test/blob/master/WS2012R2/lisa/setupscripts/FC_MultipathDetection.ps1 gets the Fibre Channel disks on the host, so the result depends on the host's FC setup.
The test script FC_multipath_detect.sh, at line #174, compares this number with fcDiskCount in the VM, which seems unreasonable?

$fcDisks = Get-Disk | Where-Object -FilterScript {$_.BusType -Eq "Fibre Channel"}
$fcCount = $fcDisks.Length

Could you please check whether this line is the correct way to get the expected number of multipath disks? I'm not sure how to change it, so I am creating this issue.

Thank you so much.

make CreateVMs.ps1 compatible with WS2012

CreateVMs.ps1 doesn't currently work on WS 2012 (non-R2).
Initially it fails due to the Generation param, which is not supported there.
Other changes might be required once the full code gets executed.

KVP pool 1 read regression test

The scope is to define a test case that writes a large number of KVP records to the pool 1 file and then reads them back.
This could result in a crash of the KVP daemon. The issue is to be fixed upstream.

hv-sock: Binaries do not work on server core without extra runtime package

Hi @ilenghel

We have been running the hv-sock-basic case, but it failed on our hosts. I tried running server_on_host.exe on our hosts, and it did not work. I did a little digging and found that the binaries might have been built against the DLL runtime library. Most of our hosts are installed with Hyper-V Server Core without extra runtime packages, so they lack the dependencies the binaries need. Could you check this? If it is true, it would be very helpful if you could recompile the binaries with the C++ (MT) option, as in Recommendation 3 of the README file.

Thank you in advance!

issues in Add-VHDXForResize.ps1 and RemoveVhdxHardDisk.ps1 scripts

@Jingli1985 We've observed 2 issues with the STOR tests after the most recent changes. Please fix them or revert the changes.
More tests from the VHD and VHDx xml files are affected; these are some samples:

For Add-VHDXForResize.ps1:

Info : VM_name currentTest updated to VHDx_SCSI_0_1_Dynamic_Large_64TB
Info : VM_name transitioned from SystemDown to RunSetupScript
Info : VM_name - running single setup script 'SetupScripts\Add-VHDXForResize.ps1'
The variable '$controllerType' cannot be retrieved because it has not been set.
At D:\lisa\SetupScripts\Add-VHDXForResize.ps1:289 char:10
+     if ( $controllerType -eq "SCSI" )
+          ~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (controllerType:String) [], RuntimeException
    + FullyQualifiedErrorId : VariableIsUndefined

The variable '$controllerType' cannot be retrieved because it has not been set.
At D:\lisaSetupScripts\Add-VHDXForResize.ps1:313 char:64
+ ...                                 $defaultSize $vhdPath $controllerType
+                                                           ~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (controllerType:String) [], RuntimeException
    + FullyQualifiedErrorId : VariableIsUndefined

The variable '$sts' cannot be retrieved because it has not been set.
At D:\lisaSetupScripts\Add-VHDXForResize.ps1:315 char:18
+         if (-not $sts[$sts.Length-1])
+                  ~~~~
    + CategoryInfo          : InvalidOperation: (sts:String) [], RuntimeException
    + FullyQualifiedErrorId : VariableIsUndefined

For RemoveVhdxHardDisk.ps1:

Info : VM_name running cleanup script setupScripts\RemoveVhdxHardDisk.ps1 for test VHDx_4k_HotADD_Multi_Dynamic_SCSI
The variable '$SCSICount' cannot be retrieved because it has not been set.
At D:\lisa\setupScripts\RemoveVhdxHardDisk.ps1:313 char:28
+     "SCSI"  { $SCSICount = $SCSICount +1 }
+                            ~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (SCSICount:String) [], RuntimeException
    + FullyQualifiedErrorId : VariableIsUndefined

The variable '$SCSICount' cannot be retrieved because it has not been set.
At D:\lisasetupScripts\RemoveVhdxHardDisk.ps1:313 char:28
+     "SCSI"  { $SCSICount = $SCSICount +1 }
+                            ~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (SCSICount:String) [], RuntimeException
    + FullyQualifiedErrorId : VariableIsUndefined

IQN testparam in STOR_VSS_ISCSI_PartitionDisks.sh

Improvement: if the IQN param is specified in the XML, do not run iSCSI discovery; connect directly to that target instead.

Right now the script connects to the first IQN it discovers. In some environments, that target may contain important data, which would be deleted by the script.
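The selection rule described above can be sketched in shell as follows. This is a minimal sketch and not the script's actual code: the function name select_iscsi_target is hypothetical, the sample IQNs are made up, and the discovery output is assumed to hold one "portal,tpgt IQN" pair per line, which is the layout iscsiadm usually prints for sendtargets discovery.

```shell
#!/bin/bash
# Hypothetical helper (not part of STOR_VSS_ISCSI_PartitionDisks.sh).
# $1: IQN taken from the test parameters; may be empty.
# $2: text of a discovery run, assumed to be "portal,tpgt iqn" per line.
select_iscsi_target() {
    local configured_iqn="${1:-}"
    local discovery_output="${2:-}"

    if [ -n "$configured_iqn" ]; then
        # An explicit IQN always wins, so discovery results are ignored.
        printf '%s\n' "$configured_iqn"
        return 0
    fi

    # No IQN configured: fall back to the first discovered target.
    printf '%s\n' "$discovery_output" | awk 'NF >= 2 { print $2; exit }'
}

# Example with made-up discovery output.
sample_discovery='192.0.2.10:3260,1 iqn.2003-01.org.example:first
192.0.2.10:3260,1 iqn.2003-01.org.example:second'
select_iscsi_target "iqn.2003-01.org.example:second" "$sample_discovery"
```

With an IQN supplied, the function returns it unchanged; only when the parameter is empty does it fall back to the first discovered entry, which is the current behavior the issue describes.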

Lisa exits upon "unsuccessful boot for test"

At random points during a test run, if the VM does not boot or does not get an IP address, LISA exits and does not continue with the remaining tests.

This behavior must be changed so that LISA waits for a timeout and then forces a reboot of the VM.
If that does not work, it must continue with the next test case.
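The requested control flow can be illustrated with a small shell sketch. LISA itself is written in PowerShell, so this only shows the logic, not a patch: ensure_vm_booted, boot_check and force_reboot are hypothetical names, and the boot check is assumed to apply its own timeout before it reports failure.

```shell
#!/bin/bash
# Hypothetical sketch of "check, force one reboot, check again".
# $1: name of a command that returns 0 once the VM is up (assumed to
#     time out on its own and return non-zero otherwise).
# $2: name of a command that forces a reboot of the VM.
ensure_vm_booted() {
    local boot_check="$1"
    local force_reboot="$2"

    if "$boot_check"; then
        return 0
    fi

    # First attempt failed: force one reboot, then check again.
    "$force_reboot" || return 1
    if "$boot_check"; then
        return 0
    fi

    # Still not up: report failure so the caller can skip this test
    # and move on to the next one instead of exiting the whole run.
    return 1
}
```

A caller would loop over the test cases and treat a non-zero return as "mark this test as failed and continue" rather than as a reason to stop the run.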

Remove the need for static MAC addresses on external interfaces

@adriansuhov Several network tests make use of static MAC addresses on the External interface.
These must be converted to use the random MAC generation interface; improvements are still to be made to that function.

A list of tests that still make use of static MAC:

  1. StaticMAC
  2. CopyLargeFile
  3. CopyFileDifferentMTU
  4. CopyBothWays

Kdump_Execute.sh: Critical issue for check kdump service status in RHEL

Kdump_Execute.sh:

  1. Issue code:
    --
    systemctl status kdump.service | grep -q "active"
    ========> In RHEL 7.x, the kdump status line reads "Active: active" or "Active: inactive", so $? is always 0
    if [ $? -ne 0 ]; then
        service kdump status | grep "operational"
    ========> In RHEL 6.x, the kdump status reads "operational" or "not operational", so $? is always 0
        if [ $? -eq 0 ]; then
            LogMsg "Kdump is active after reboot"
            echo "Success: kdump service is active after reboot." >> ~/summary.log
        else
            LogMsg "ERROR: kdump service is not active after reboot!"
            echo "ERROR: kdump service is not active after reboot!" >> ~/summary.log
            UpdateTestState "TestAborted"
            exit 1
        fi
    else
        LogMsg "Kdump is active after reboot"
        echo "Success: kdump service is active after reboot." >> ~/summary.log
    fi

  2. Issue code:
    ========> utils.sh is pushed to the VM in cases.xml; after sourcing it, we can use $DISTRO to check for Red5.*, Red6.*, Red7.*, Ubuntu13* and Ubuntu14*, and then match them here with redhat* and ubuntu*
    ========> utils.sh is pushed to the VM in cases.xml; after sourcing it, we can use LogMsg directly
    --
    LogMsg()
    {
        # To add the time-stamp to the log file
        echo `date "+%a %b %d %T %Y"` ": ${1}"
    }

    UpdateTestState()
    {
        echo $1 >> ~/state.txt
    }

    #######################################################################
    #
    # LinuxRelease()
    #
    #######################################################################
    LinuxRelease()
    {
        DISTRO=`grep -ihs "buntu\|Suse\|Fedora\|Debian\|CentOS\|Red Hat Enterprise Linux" /etc/{issue,*release,*version}`

        case $DISTRO in
            *buntu*)
                echo "UBUNTU";;
            Fedora*)
                echo "FEDORA";;
            CentOS*)
                echo "CENTOS";;
            *SUSE*)
                echo "SLES";;
            RedHat*)
                echo "RHEL";;
            Debian*)
                echo "DEBIAN";;
        esac
    }
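The first point above comes down to substring matching: a plain grep for "active" also matches "inactive", and a grep for "operational" also matches "not operational". A minimal sketch of a stricter check is shown here; the function name kdump_status_is_active is hypothetical, and the status strings are assumed to look like the ones quoted in this issue.

```shell
#!/bin/bash
# Hypothetical helper: decide from captured status text whether kdump runs.
# Assumed formats, as quoted in the issue:
#   RHEL 7.x: "Active: active ..."   vs "Active: inactive ..."
#   RHEL 6.x: "... operational"      vs "... not operational"
kdump_status_is_active() {
    local status_text="${1:-}"

    # systemd style: "active" must directly follow "Active:" as a whole word.
    if printf '%s\n' "$status_text" |
        grep -qE 'Active:[[:space:]]+active([[:space:]]|$)'; then
        return 0
    fi

    # SysV style: accept "operational" only when it is not negated.
    if printf '%s\n' "$status_text" | grep -q 'operational' &&
        ! printf '%s\n' "$status_text" | grep -q 'not operational'; then
        return 0
    fi

    return 1
}
```

On systemd-based releases, systemctl is-active --quiet kdump.service reports the state through its exit code and avoids parsing the status text altogether.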

Kdump_Results.sh: vm2ipv4 is null

Kdump_Results.sh:

Code:
case $DISTRO in
if [ $vm2ipv4 != "" ]; then
If $vm2ipv4 is not set, the line above expands to if [ != "" ]; then, which is an error.
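Quoting the variable keeps the test well-formed even when the value is empty or unset. The sketch below shows one way to write the check; the function name check_vm2ipv4 and its messages are hypothetical and are not taken from Kdump_Results.sh.

```shell
#!/bin/bash
# Hypothetical helper: report whether a second VM address was provided.
# $1: the value of vm2ipv4, which may be empty or missing.
check_vm2ipv4() {
    local addr="${1:-}"

    # The quotes make the shell pass an empty string to [ instead of
    # dropping the argument, and -n tests for a non-empty value.
    if [ -n "$addr" ]; then
        echo "vm2ipv4=$addr"
    else
        echo "vm2ipv4 is not set"
        return 1
    fi
}

# ${vm2ipv4:-} expands to an empty string when the variable is unset.
check_vm2ipv4 "${vm2ipv4:-}" || true
```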

[feature] lisa flag to stop run on VM crash

With a LISA flag set, a test run should exit immediately when the VM crashes and leave the VM in its current state.
This is useful for troubleshooting VM failures such as call traces and similar situations.

STOR_VSS_BackupRestore_DiskStress.ps1 could not run pass on Hyper-v 2012R2

There are three issues when running the STOR_VSS_BackupRestore related test cases; please refer to the comments below (tested mainly on a 2012 R2 host).

  1. STOR_VSS_BackupRestore_DiskStress
    When "while true ; do ./iozone -ag 10G ; done > /dev/null 2>&1 & " is running and a backup is started from the host, the backup does not finish even after more than 1 hour, and the case is aborted. The test VM has 2 GB of memory and 2 CPUs on a Hyper-V 2012 R2 host.

The case passes on a 2016 host. To fix the issue on 2012 R2, either reduce the stress to ./iozone -ag 4G (removing the loop) for both Hyper-V 2012 R2 and 2016, or lighten the stress only for the 2012 R2 host.

  2. STOR_VSS_BackupRestore_Fail

$EventLog = Get-WinEvent -ProviderName Microsoft-Windows-Hyper-V-VMMS: it looks like this command has changed, because it returns an error saying there is no provider named Microsoft-Windows-Hyper-V-VMMS. When Get-WinEvent is used on its own, warning ID 10107 or 10150 cannot be found.

Also, after fsfreeze -f $MountName, the backup still succeeds. STOR_VSS_BackupRestore_Fail first checks $sts = startBackup $vmName $driveletter and expects the backup to succeed, although the comment says "Backu should fail". The test case step reads "3. Change host side configuration to insert failure in backup.  Or disable VSS daemon in guest.", which differs slightly from the test case description. Note that in RHEL 7.4, an offline backup can still be done when the VSS daemon is disabled.

  3. STOR_VSS_3Chain_VHD_Backup also does not pass locally. Based on my last debugging session, the newly created child VHD VM fails to back up, but this needs further debugging.

I am filing these issues to ask for help, since I am not sure of the exact fix. Do these cases pass on your side?
Could you please take a look at these three issues? Thank you so much.

f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault error

Hi Chris,

When running the latest lis-test, we get the errors below. They are possibly related to the following line in https://github.com/LIS/lis-test/blob/master/WS2012R2/lisa/utilFunctions.ps1: Stop-VM $($vm.vmName) -ComputerName $($vm.hvServer) -Force -ErrorAction SilentlyContinue. The errors appear when we use the pull request https://github.com/LIS/lis-test/commit/13c3dffb19d2bf053ab8b1a9b4b7779e15bded14#diff-5a94f3945471067030ff7a7e991e966b.

With the older script everything runs smoothly, but I have not yet found why this commit induces the error. The PowerShell version is already 5.1, and we did not change the host.

09:19:50 Info : RHEL-7.5--GEN2-A Over-riding default snapshotName from global section to ICABase
09:19:51 Info : RHEL-7.5-.0-x86_64-GEN2-A is being reset to snapshot ICABase
09:19:51 Processing data for a remote command failed with the following error message:
09:19:51 <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault"
09:19:51 Code="3221225477" Machine="2016-AUTO"><f:Message><f:ProviderFault
09:19:51 provider="microsoft.powershell" path="C:\Windows\system32\pwrshplugin.dll"></f:
09:19:51 ProviderFault></f:Message></f:WSManFault> For more information, see the
09:19:51 about_Remote_Troubleshooting Help topic.
09:19:51 + CategoryInfo : OperationStopped: (2016-AUTO:String) [], PSRemot
09:19:51 ingTransportException
09:19:51 + FullyQualifiedErrorId : JobFailure
09:19:51 + PSComputerName : 2016-AUTO

Thank you so much.
Best Regards,
Xuemin

LISA exits at exception: Provider load failure

We have been experiencing some instability when running LISA. The framework exits with the exception "Provider load failure". The exception is raised in the state engine, which makes the framework exit.

We tried to reproduce this issue. It is reproducible, but the reproduction rate is low (less than 1% in my tests). We mostly observe the exception after pause/save cases, so we assume the pause/save operations are related (although we are not really sure).

Since LISA performs VM operations quite frequently (start-vm, get-vm etc.), this issue has been a headache for us. We are trying to address it somehow, or at least to prevent the framework from exiting.

We still don't have a clear idea of how to address it, so we are opening this issue for discussion. Thank you in advance!

An excerpt from the log:

11:10:13 Info : DoStartSystem( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:13 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A is being started
11:10:14 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from StartSystem to SystemStarting
11:10:14 Info : Entering DoSystemStarting( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:14 Debug: vm.ipv4 = 10.73.74.55
11:10:15 Debug: vm.ipv4 = 10.73.74.55 and ipv4 =
11:10:19 Info : Entering DoSystemStarting( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:19 Debug: vm.ipv4 = 10.73.74.55
11:10:19 Debug: vm.ipv4 = 10.73.74.55 and ipv4 =
11:10:22 Info : Entering DoSystemStarting( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:22 Provider load failure
11:10:22 + CategoryInfo : NotSpecified: (:) [Get-VM], VirtualizationExcept
11:10:22 ion
11:10:22 + FullyQualifiedErrorId : Unspecified,Microsoft.HyperV.PowerShell.Commands
11:10:22 .GetVM
11:10:22 + PSComputerName : 2016-Auto
11:10:22
11:10:22 Error: RHEL-6.10-20180301.2-x86_64-GEN2-A SystemStarting entered state without being in a HyperV Running state - disabling VM
11:10:22 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from SystemStarting to ForceShutDown
11:10:22 Debug: vm.ipv4 = 10.73.74.55
11:10:22 Provider load failure
11:10:22 + CategoryInfo : InvalidOperation: (:) [Get-WmiObject], Managemen
11:10:22 tException
11:10:22 + FullyQualifiedErrorId : GetWMIManagementException,Microsoft.PowerShell.C
11:10:22 ommands.GetWmiObjectCommand
11:10:22 + PSComputerName : 2016-Auto
11:10:22
11:10:22 Debug: vm.ipv4 = 10.73.74.55 and ipv4 =
11:10:26 Info : DoForceShutdown(RHEL-6.10-20180301.2-x86_64-GEN2-A)
11:10:26 Info : GetTestData(done)
11:10:27 Info : SetRunningTime(done)
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A currentTest lasts 0 Hours, 0 Minutes, 15 seconds.
11:10:27 Info : GetCurentSuite(Functional)
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from ForceShutDown to SystemDown
11:10:27 Info : Entering DoSystemDown( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A currentTest updated to done
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from SystemDown to Finished
11:10:27 Info : SaveResultToXML to (Functional,TestResults\cases-20180314-183507)
11:10:27 Info : GetCurentSuite(Functional)
11:10:27 Info : DoStateMachine() exiting

Reproduce script:

$count = 0
while ($True) {
    $a = Save-VM -Name xxx
    if ($? -ne $True) {break}
    $a = Start-VM -Name xxx
    if ($? -ne $True) {break}
    $a = Get-VM -Name xxx
    if ($? -ne $True) {break}
    $a = Get-VMIntegrationService -VMName xxx
    if ($? -ne $True) {break}
    $count += 1
    $count
}

Error message:

Start-VM : Provider load failure
At line:4 char:1
+ Start-VM -Name xxx
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Start-VM], VirtualizationException
+ FullyQualifiedErrorId : Unspecified,Microsoft.HyperV.PowerShell.Commands.StartVM

Stabilize SR-IOV throughput test cases

The current automation measures the throughput once, exactly 1 minute after each operation, which is sometimes not reliable enough.
The suggestion is to average the throughput over several measurements after each operation, to obtain more representative values, and then compare.
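The averaging step can be sketched in shell as follows. This is only an illustration: the function name average_throughput is hypothetical, the samples are made-up numbers, and how the individual samples are collected is left to the test script.

```shell
#!/bin/bash
# Hypothetical helper: print the arithmetic mean of the numeric samples
# passed as arguments, rounded to two decimals.
average_throughput() {
    if [ "$#" -eq 0 ]; then
        echo "average_throughput: no samples given" >&2
        return 1
    fi

    # One sample per line; awk sums them and divides by the line count.
    printf '%s\n' "$@" |
        LC_ALL=C awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# Example with made-up samples; prints 20.00
average_throughput 10 20 30
```

Comparing two averages instead of two single readings makes one outlier sample less likely to fail the test.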

implement params for the xml global section

The config-global xml section must be created dynamically from the parameters passed to lisa.
Primarily, these values must be handled as lisa params: logfileRootDir (already implemented) and imageStoreDir.
@alexngmsft for ack

Initialize-HypervHost -git fails

Calling Initialize-HypervHost.ps1 -git fails for me, likely because & $cmd $options returns right away.
I'm not sure whether git-installer.exe has an option to "not fork", so that the ps1 script actually waits for git-installer.exe to return.

The subsequent del git-installer.exe does not work either, likely because the file is busy. No error is reported.

Finally, the test for git.exe fails because not much has actually been installed by the time "test -f $path/git.exe" is executed.

Consolidate VSS area test files

The STOR_VSS_BackupRestore PS scripts contain mostly duplicate code and differ only in the test-specific parts.
These parts must be converted into functions in the main STOR_VSS_BackupRestore script as much as possible.

Use OpenSSH instead of Putty

Would it be possible to replace the dependency on PuTTY with OpenSSH? A quick script to set this up:

# install Chocolatey & OpenSSH
Set-ExecutionPolicy Bypass
iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
choco install -y openssh -params "/SSHServerFeature"
refreshenv

lsvmbus tool automation

Feature description:
You can use the Python script lsvmbus in /usr/sbin to get information about devices on the Hyper-V virtual machine bus (VMBus), similar to information commands like lspci.

Automation must validate the functionality of this tool on all supported distributions.

[Feature Request] How about adding rerun failed test case function in the test framework?

Hi Chris,

Would you consider adding a rerun-failed-test-cases feature to the lis-test framework? Even though we debug failed test cases continually, we still cannot make sure that all test cases pass 100% in Jenkins jobs.

Locally we run 110+ cases (a subset of the current lisa-test) on WS2016, 2012R2, 2012, Gen1, Gen2, x86_64 and i386 (for rhel6 only) with the same xml for every internal formal build. For the WS2012R2 and 2012 hosts, we need to log in to the host to rerun the failed cases, change the case xml file and, after they pass, update the test results that were already uploaded to Polarion, the test case result management system. Rerunning and re-updating the results generally takes time.
After a rerun the cases pass, e.g. the timesync related test cases. Although we can check the failure log and make some enhancements, it would be great if we already knew which kinds of cases pass after being rerun twice, and it could save a lot of effort.

We are planning to rerun the failed cases locally, e.g. parse cases.xml, select the failed cases, rerun them, and merge the test results with a local script, but this work cannot be upstreamed and cannot benefit others. It would be great if the test framework could support this feature.

Any suggestions for this feature request?

Thank you so much.
