
microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.

Home Page: https://docs.microsoft.com/en-us/azure/service-fabric/

License: MIT License

Shell 0.14% CMake 0.34% Perl 0.08% C++ 62.13% C 6.35% Objective-C 0.01% HTML 0.01% Awk 0.01% Python 0.01% Dockerfile 0.01% C# 26.55% Batchfile 0.07% Roff 4.26% Makefile 0.01% JavaScript 0.01% PowerShell 0.04% Mathematica 0.01%
cloud-native containers orchestration distributed-systems cloud-computing microservices

service-fabric's Introduction

Service Fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale. Service Fabric runs on Windows and Linux, on any cloud, any datacenter, across geographic regions, or on your laptop. Service Fabric represents the next-generation platform for building and managing these enterprise-class, tier-1, cloud-scale applications running in containers.

Architecture and Subsystem Explorer

Learn about Service Fabric's Core Subsystems, mapped to this repo's folder structure.

Service Fabric release schedule

Here is the upcoming release schedule for Service Fabric runtime versions that we will be supporting starting with version 8.0.

Version | Release date
8.0     | 2021 Mar
8.1     | 2021 Jul
8.2     | 2021 Oct
9.0     | 2022 Apr
9.1     | 2022 Oct
10.0    | 2023 Apr
10.1    | 2023 Nov

Please note that these dates are advanced estimates and might be subject to change or minor adjustments closer to each release.

We will be publishing upcoming features and roadmap items on Azure Updates for Service Fabric.

Repo status

We are in the process of moving our development to GitHub. Until then, the Service Fabric team will continue regular feature development internally. We'll be providing frequent updates here and on our team blog as we make progress.

Quick look at our current status

  • Service Fabric build tools for Linux
  • Basic tests for Linux builds available
  • Container image with build tools available to run builds

Providing feedback and filing issues

We have multiple repositories (in addition to this one) that constitute the Service Fabric product. For more information on how to provide feedback and file issues across the different components (and associated repositories), please see Contributing.md.

Build Requirements

The requirements below are based on running clean builds using ninja, with the command

runbuild.sh -c -n

The builds were run on Azure Linux VMs with added disk capacity. If you want to build on an Azure machine, you need to add approximately 70GB for the source and build outputs.

These times should be taken as estimates of how long a build will take.

Machine SKU      | Cores | Memory | Build Time
Standard_D8s_v3  | 8     | 32GB   | ~4 hours
Standard_D16s_v3 | 16    | 64GB   | ~2 hours
Standard_D32s_v3 | 32    | 128GB  | ~1 hour

On a smaller VM (Standard_D4s_V3 / 4 cores / 16GB) the build may fail. You may be able to build on a machine with less RAM if you limit the parallelism using the -j switch.
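
As a rough illustration of limiting parallelism by memory, the sketch below picks a -j value that is capped by both the core count and the available RAM. The function name suggest_jobs and the gb_per_job parameter are hypothetical; the README does not state how much memory a single build job needs, so that figure has to be found by experiment.

```python
def suggest_jobs(cores: int, mem_gb: float, gb_per_job: float) -> int:
    """Return a candidate -j value capped by core count and by memory.

    gb_per_job is an assumed memory budget per parallel job; it is not
    documented by the project and must be tuned for the machine.
    """
    if cores < 1 or mem_gb <= 0 or gb_per_job <= 0:
        raise ValueError("cores, mem_gb and gb_per_job must be positive")
    # Number of jobs that fit in memory under the assumed per-job budget.
    by_memory = int(mem_gb // gb_per_job)
    # Never return less than one job, never more than the core count.
    return max(1, min(cores, by_memory))
```

For example, suggest_jobs(4, 16, 8) yields 2 under an assumed budget of 8GB per job; the result would then be passed to the build through the -j switch.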

The build also requires approximately 70GB of disk space.

Setting up for build

Get a Linux machine

This is the Linux version of Service Fabric. You need a Linux machine to build this project. If you already have a Linux machine, great! You can get started below. If not, you can get a Linux machine on Azure.

Installing docker

Our build environment depends on Docker. In order to get started you will need to install docker.

There are many ways to install docker. Here is how to install on Ubuntu:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce

Optional: Enable executing docker without sudo

By default, docker requires root privileges to run. In order to run docker as a regular user (i.e., not root), you need to add the user to the docker user group:

sudo usermod -aG docker ${USER}
su - ${USER}

You do not need to do this, but note that if you skip this step, you must run all docker commands with sudo.

Build Service Fabric

To start the build inside of a docker container you can clone the repository and run this command from the root directory:

./runbuild.sh

This will do a full build of the project with the output being placed into the out directory. For more options see runbuild.sh -h.

Additionally, to build and create the installer packages, you can pass the -createinstaller option to the script:

./runbuild.sh -createinstaller

Optional: Build the container locally

If you would prefer to build the container locally, you can run the following script:

sudo ./tools/builddocker.sh

Currently, the build container is based off a base image that includes a few Service Fabric dependencies that have either not yet been open sourced, or must be included due to technical constraints (for example, some .NET files currently only build on Windows, but are required for a Linux build).

This will pull all of the required packages, add Service Fabric internal dependencies, and apply patches.

Troubleshooting: Internet connectivity when installing local docker containers behind a firewall

A common issue with building a docker container behind a firewall is when the firewall blocks the default DNS used by docker. This will manifest as packages failing to download during the docker build step (such as in the builddocker.sh script above).

To fix this, you need to tell Docker to use an alternative DNS server. As a root user, create or edit the Docker daemon's config file at /etc/docker/daemon.json so that it has an entry that looks like this:

{ 
    "dns": ["<my DNS server IP here>", "<my DNS secondary server IP here>"] 
}

Be sure to replace the placeholders above with the addresses of your actual local DNS servers, and then restart docker:

service docker restart
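
If /etc/docker/daemon.json already contains other settings, the "dns" entry has to be merged in rather than overwriting the file. The following is a minimal sketch of that merge; the function name with_dns is hypothetical, and writing the result back to the file (as root) is left out.

```python
import json
from typing import List


def with_dns(config_text: str, servers: List[str]) -> str:
    """Return daemon.json text with the "dns" key set, keeping other keys."""
    # An empty or missing file is treated as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    # Replace any previous "dns" entry with the requested servers.
    config["dns"] = list(servers)
    return json.dumps(config, indent=4) + "\n"
```

Calling with_dns with the current file contents and a list of DNS server addresses returns the text to write back before restarting docker.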

Testing a local cluster

For more details please refer to Testing using ClusterDeployer.

Running a local cluster

For more details please refer to Deploying local cluster from build.

Documentation

Service Fabric conceptual and reference documentation is available at docs.microsoft.com/azure/service-fabric. Documentation is also open to your contribution on GitHub at github.com/Microsoft/azure-docs.

Samples

For Service Fabric sample code, check out the Azure Code Sample gallery or go straight to Azure-Samples on GitHub.

Channel 9: Inside Azure Service Fabric

Take a virtual tour with us and meet some of the folks who design and implement Service Fabric. This Channel 9 YouTube playlist will continue to grow over time with content describing the inner workings of Service Fabric. We have covered most of the subsystems already.

License 

All Service Fabric open source projects are licensed under the MIT License.

Code of Conduct 

All Service Fabric open source projects adopt the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

service-fabric's People

Contributors

abatishchev, amanbha, amenarde, anmolah, anthonym, athinanthny, bpm-ms, craftyhouse, divyac0403, dkkapur, dragav, drnic, erikadoyle, gittorre, gkhanna79, jeffj6123, kant, microsoft-github-policy-service[bot], mkosieradzki, msingiresu, peterpogorski, samedder, sfilatov, sridmad, sukanyamsft, sunil-indoria, tomvcassidy, v-mankan, v-ragaon, vturecek

service-fabric's Issues

[Feature Request] Expose current Epoch for Stateful Service Partition

I am preparing for some advanced DR scenarios, and I would like to make sure that, in case of data loss, state across different services does not become desynchronized. I need to be able to invalidate state on dependent services in case of a discrepancy.

I would like to derive some opaque value from, for example, DataLossNumber and append it to every request. That way I can detect that a request comes from a different Epoch and act accordingly (catch up with the other service, or discard the request as coming from a past Epoch).
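
To make the idea concrete, here is a minimal sketch of the check a receiving service might perform, assuming the runtime exposed the value. The names EpochToken and classify are hypothetical, and the sketch compares the numbers directly, which assumes the token is ordered rather than fully opaque.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpochToken:
    """Value a sender would attach to each request (hypothetical type)."""
    data_loss_number: int


def classify(request: EpochToken, last_seen: EpochToken) -> str:
    """Compare a request's token with the last token seen from that sender."""
    if request.data_loss_number == last_seen.data_loss_number:
        return "current"   # same epoch: process normally
    if request.data_loss_number < last_seen.data_loss_number:
        return "stale"     # request from a past epoch: discard it
    return "ahead"         # sender went through data loss: catch up first
```

A receiver would store the last token per sender and update it whenever classify returns "ahead", after invalidating or resynchronizing the dependent state.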

Feature request: reliable collection explorer

Currently, there's no way to view (or modify) reliable collection items.
Considering that a lot of data could potentially be stored in SFRC to replace a data store, this becomes a question of "if this is my data store, why can't I see the data".

Is this something that is planned and if so, what's the functionality it will support?

Question: How to fix the "Not Found" in portal/nodes

It seems to be related to the NSG, which is blocking the status data communication.
I've tried allowing inbound as well as outbound traffic for ports 19080 and 19000. However, I only got it to work once, for 5 minutes, and I cannot get it to work anymore.

Any hints?
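
One way to narrow this down is to test whether the two ports are reachable at the TCP level from the machine where the portal or browser traffic originates. The sketch below is a generic connectivity probe, not a Service Fabric tool; the function name port_open is hypothetical, and a successful connection only shows that the path is open from the machine running the script.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Running port_open against the cluster address for 19080 and 19000 before and after changing the NSG rules would show whether the rules are taking effect.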

Cluster Certificate swap - Stuck in Updating state

We have a cluster stuck in limbo after swapping out the primary cluster certificate for a newer certificate.
We did these operations (adding a secondary and swapping) using ARM in one go,
as described here: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-security-update-certs-azure#add-a-secondary-certificate-and-swap-it-to-be-the-primary-using-resource-manager-powershell
The New-AzureRmResourceGroupDeployment execution timed out after more than 2 hours.
Although the certificate swap seems to have been successful on all nodes, the Azure Portal reports a failed deployment due to the timeout, and
the cluster is still in 'Updating' mode more than 24 hours later (Cluster is updating user certificate.)
This means that we are stuck here.

Also, the fabric:/System/UpgradeService is reporting an unhealthy evaluation:
Unhealthy event: SourceId='UpgradeService.Primary', Property='SFRPPoll', HealthState='Warning', ConsiderWarningAsError=false.
Exception encountered: System.Net.Http.HttpRequestException: Response status code does not indicate success: 500 (Internal Server Error).

Any advice is appreciated...

Service Fabric - Error occurred in request., ConnectionError: HTTPConnectionPool(host='ipaddress', port=port): Max retries exceeded with url:

Hello Team,

Whenever I try to deploy apps to a Service Fabric cluster using the command below,
I see the exception below.

Command - sfctl cluster select --endpoint http://ipaddress:port
Error -
Error occurred in request., ConnectionError: HTTPConnectionPool(host='ipaddrses', port=19000): Max retries exceeded
with url: / (Caused by ProtocolError('Connection aborted.', BadStatusLine('\x99\x00\x00\x00\x00\x00G\x00\x00\x00\x00\x00
\x03\x80 \x00\x00\x1f\x00\x00\x00\x00\x01\x00\x00\x1f\x0cd\xbf!$*\xab\xb4G\x8e\x19v\x88]\xf40\xc6\x08\x87j/?\x01\x80\x0e
\x00\x00\r\x00\x00\x00\x00\x01\x00\x00\x1f\t\x01/?\t\x80\r\x00\x00\x0c\x00\x00\x00\x00\x01\x00\x00\x1fB/?\x00E\x00\x00\x
00\x00\x00\x00\x00\x1f\x8d\x131\x000\x00.\x002\x005\x003\x00.\x001\x008\x006\x00.\x001\x001\x00:\x001\x009\x000\x000\x00
0\x00J\x0c\x03\xe7j\x1d\xbai\x89H\xa4l\xb5C\xbe\xcai\xc7/?',)))
Traceback (most recent call last):
File "c:\python27\lib\site-packages\knack\cli.py", line 125, in invoke
cmd_result = self.invocation.execute(args)
File "c:\python27\lib\site-packages\knack\invocation.py", line 85, in execute
cmd_result = parsed_args.func(params)
File "c:\python27\lib\site-packages\knack\commands.py", line 67, in call
return self.handler(args, **kwargs)
File "c:\python27\lib\site-packages\knack\commands.py", line 123, in _command_handler
result = op(client, **command_args) if client else op(**command_args)
File "c:\python27\lib\site-packages\sfctl\custom_cluster.py", line 94, in select
rest_client.send(rest_client.get('/')).raise_for_status()
File "c:\python27\lib\site-packages\msrest\service_client.py", line 220, in send
raise_with_traceback(ClientRequestError, msg, err)
File "c:\python27\lib\site-packages\msrest\exceptions.py", line 48, in raise_with_traceback
raise error
ClientRequestError: Error occurred in request., ConnectionError: HTTPConnectionPool(host='ipaddrses', port=19000): M
ax retries exceeded with url: / (Caused by ProtocolError('Connection aborted.', BadStatusLine('\x99\x00\x00\x00\x00\x00G
\x00\x00\x00\x00\x00\x03\x80 \x00\x00\x1f\x00\x00\x00\x00\x01\x00\x00\x1f\x0cd\xbf!$
\xab\xb4G\x8e\x19v\x88]\xf40\xc6\x0
8\x87j/?\x01\x80\x0e\x00\x00\r\x00\x00\x00\x00\x01\x00\x00\x1f\t\x01/?\t\x80\r\x00\x00\x0c\x00\x00\x00\x00\x01\x00\x00\x
1fB/?\x00E\x00\x00\x00\x00\x00\x00\x00\x1f\x8d\x131\x000\x00.\x002\x005\x003\x00.\x001\x008\x006\x00.\x001\x001\x00:\x00
1\x009\x000\x000\x000\x00J\x0c\x03\xe7j\x1d\xbai\x89H\xa4l\xb5C\xbe\xcai\xc7/?',)))

In SF Explorer, all nodes are in a healthy state, as are system services such as the naming service, the cluster manager service, and the image store service.

When I run the above command, I see the above error.
In SF Explorer, I then see the nodes going down.
I also saw this exception in the Event Viewer on the box.

Faulting application name: FabricHost.exe, version: 5.7.198.9494, time stamp: 0x597913e1
Faulting module name: FabricHost.exe, version: 5.7.198.9494, time stamp: 0x597913e1
Exception code: 0xc0000602
Fault offset: 0x000000000003af68
Faulting process id: 0x15c8
Faulting application start time: 0x01d359ce58708519
Faulting application path: C:\Program Files\Microsoft Service Fabric\bin\FabricHost.exe
Faulting module path: C:\Program Files\Microsoft Service Fabric\bin\FabricHost.exe
Report Id: b2777a7a-c1f3-4014-bd6a-cf0084991764
Faulting package full name:
Faulting package-relative application ID:

After some time, the nodes become healthy again, but when I try again, I see the same exception.

Thanks,
naresh Khatri
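
One possible reading of the traceback above: the BadStatusLine payload is binary rather than an HTTP status line, which would be consistent with sfctl being pointed at the native client connection port (19000 in the error) instead of the HTTP gateway port. The sketch below flags that case before connecting. The function name endpoint_warning is hypothetical, and the two port constants are assumed defaults that a given cluster may override.

```python
from typing import Optional
from urllib.parse import urlsplit

NATIVE_CLIENT_PORT = 19000  # assumed default client connection endpoint
HTTP_GATEWAY_PORT = 19080   # assumed default HTTP gateway endpoint


def endpoint_warning(endpoint: str) -> Optional[str]:
    """Return a warning if the endpoint looks like the native client port."""
    try:
        port = urlsplit(endpoint).port
    except ValueError:
        # Raised when the port part of the URL is not a valid number.
        return "endpoint has a non-numeric or out-of-range port"
    if port == NATIVE_CLIENT_PORT:
        return (
            f"port {NATIVE_CLIENT_PORT} is usually the native client endpoint; "
            f"the HTTP gateway is usually on {HTTP_GATEWAY_PORT}"
        )
    return None
```

If the warning fires, trying the same sfctl cluster select command against the HTTP gateway port is a cheap first check.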

Installing Fabric Cluster results in unhealthy Fault Manager

In the last 24 hours I have tried several times to create a 5-node unsecured Azure Service Fabric cluster using the new burstable VMs (2 CPUs, 8 GB RAM, 8 GB HDD), with default options plus the Fault Analysis service and the DNS service. After over an hour of provisioning, the nodes are still in baseline upgrade with an error against the FaultAnalysis service. The first two times I got out-of-disk-space errors, and this last time I got the error below. I initially tried Automatic Upgrade, which gave me the new v6.0.211.9494 version, but I suspected it had faults, so I tried the oldest version, v5.7.198.9494, which gives me the error below. (I have successfully created a similar cluster using D-series machines with a similar spec and had no disk space issues.)

Unhealthy event: SourceId='System.RA', Property='ReplicaOpenStatus', HealthState='Warning', ConsiderWarningAsError=false.
Replica had multiple failures in_burst_0 API call: IStatefulServiceReplica.Open(); Error = System.Fabric.FabricException (-2147024784)
An error occurred during this operation.  Please check the trace logs for more details.
System.Runtime.InteropServices.COMException (-2147024784)
There is not enough space on the disk. (Exception from HRESULT: 0x80070070)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Fabric.Data.Log.LogManager.d__16.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.KtlLogManager.d__20.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.KtlLogManager.d__15.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.LoggingReplicator.d__136.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.LoggingReplicator.d__135.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.DynamicStateManager.d__109.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.TransactionalReplicator.d__79.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.StatefulServiceReplica.d__21.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Services.Runtime.StatefulServiceReplicaAdapter.d__18.MoveNext()

Building solution with /t:Package does not include all files for services

Hello, I'm trying to build a solution containing multiple SF apps and various services using msbuild. First, I build all the services like so:

msbuild /m /p:"Configuration=AppsDebug;Platform=x64;OutputPath=bin\x64\Debug\;IntermediateOutputPath=obj\x64\Debug\" 

Then I run the package target on only the sf apps:

msbuild /m /p:"Configuration=FabricDebug;Platform=x64;OutputPath=bin\x64\Debug\;IntermediateOutputPath=obj\x64\Debug\" /t:Package 

When this is done there are no errors and the pkg folders are created correctly. However, a lot of DLLs are missing from the package folder: all the Service Fabric DLLs are included, but any other DLLs that the service references are not.

If I run the package target directly on the sfproj file rather than on the solution, everything works, but I'd like to run it on the solution because it's a lot faster. Is that not supported, or am I missing something else?

Add-AzureRmServiceFabricNodeType interrupted during execution left cluster+nodetype in faulty state

I have been trying, to no avail, to fix or remove a node type that seems to be "in limbo". I used Add-AzureRmServiceFabricNodeType to create a new node type (called "compute") in our production cluster. During execution of the command I lost connectivity on my network.
The command did not fully complete. Now I'm left with a cluster that has the node type (I can see it in the Azure portal, as well as in the output from Get-AzureRmServiceFabricCluster), but the node type has no corresponding VM scale set. In the cluster, the infrastructure service fabric:/System/InfrastructureService/compute is created, but it is naturally failing since it is required to run on nodes that match NodeTypeName==compute, of which there are none.

I have tried to remove the faulty node type using Remove-AzureRmServiceFabricNodeType, but the command fails, reporting that the specified node type does not exist.

Note: the cluster is Silver-level (with Bronze durability) running on fabric version 5.6.210.9494.

Any advice on how to fix this issue would be greatly appreciated.

I should also note that, in the current state, commands that update the cluster configuration, such as adding an admin certificate, also fail.

NodeAgentSFUtility.exe fails with "connection denied" exception when running POA on a standalone cluster with Windows security

When InstallAndRunServices.bat runs, I'm seeing a couple of issues:

  1. The command logman -ets FabricTraces | findstr /i "24afa313-0d3b-4c7c-b485-1047fd964b60" seems to always get a hit, causing the logs directory to be deleted. I never see the logs directory come back again. I assume this is not intentional?

  2. The two commands "logman stop PatchOrchestrationServiceTraces" and "logman delete PatchOrchestrationServiceTraces" always fail, possibly leading to the above issue if POA is reinstalled, but I am unsure about that connection. See SetupLogs.txt below.

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>@echo off

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>setlocal enabledelayedexpansion

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set rootDir=C:

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set currentDir=C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set applicationDir=C:\PatchOrchestrationApplication

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set workingDir=C:\PatchOrchestrationApplication\NodeAgentNTService

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set logsDir=C:\PatchOrchestrationApplication\logs

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set serviceExe=C:\PatchOrchestrationApplication\NodeAgentNTService\NodeAgentNTService.exe

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>set serviceName=POSNodeSvc

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>echo C:\PatchOrchestrationApplication\NodeAgentNTService
C:\PatchOrchestrationApplication\NodeAgentNTService

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>echo C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>IF NOT EXIST C:\PatchOrchestrationApplication (mkdir C:\PatchOrchestrationApplication )

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>IF NOT EXIST C:\PatchOrchestrationApplication\NodeAgentNTService (mkdir C:\PatchOrchestrationApplication\NodeAgentNTService )

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>IF NOT EXIST C:\PatchOrchestrationApplication\logs (mkdir C:\PatchOrchestrationApplication\logs )

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>logman -ets FabricTraces | findstr /i "24afa313-0d3b-4c7c-b485-1047fd964b60" 1>nul

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>if 0 == 0 (
echo "POA provider for coordinator service found in FabricTraces"
logman stop PatchOrchestrationServiceTraces
echo "Stopped trace session PatchOrchestrationServiceTraces"
logman delete PatchOrchestrationServiceTraces
echo "Deleting local traces"
rmdir /s /q C:\PatchOrchestrationApplication\logs
) else (
echo "POA provider not found in FabricTraces. Starting local trace session PatchOrchestrationServiceTraces"
REM Create tracing session for dumping logs locally
logman stop PatchOrchestrationServiceTraces
echo "Stopped trace session PatchOrchestrationServiceTraces"
logman delete PatchOrchestrationServiceTraces
logman create trace PatchOrchestrationServiceTraces -pf logmansessions.cfg -o C:\PatchOrchestrationApplication\logs\PatchOrchestrationServiceTraces.etl -v mmddhhmm -bs 64 -max 100
logman start PatchOrchestrationServiceTraces
if !errorlevel! NEQ 0 exit /b !errorlevel!
)
"POA provider for coordinator service found in FabricTraces"

Error:
Data Collector Set was not found.
"Stopped trace session PatchOrchestrationServiceTraces"

Error:
Data Collector Set was not found.
"Deleting local traces"

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>REM Grant access to logs dir so that NodeAgentService can do cleanup

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>icacls C:\PatchOrchestrationApplication\logs /grant "Network Service":(OI)(CI)F /T
Successfully processed 0 files; Failed processing 0 files

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>if 0 NEQ 0 exit /b 0

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>REM Enable below line if we're not able to copy Settings.xml from NodeAgentService

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>REM icacls C:\PatchOrchestrationApplication\NodeAgentNTService /grant "Network Service":(OI)(CI)F /T

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>REM Stop the service and uninstall the current version

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>sc stop POSNodeSvc
[SC] OpenService FAILED 1060:

The specified service does not exist as an installed service.

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>sc delete POSNodeSvc
[SC] OpenService FAILED 1060:

The specified service does not exist as an installed service.

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>REM Cleanup the C:\PatchOrchestrationApplication\NodeAgentNTService for all predecided folders.

C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4>copy "C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4*.*" "C:\PatchOrchestrationApplication\NodeAgentNTService"
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\CommandProcessor.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\CommandProcessor.pdb
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\InstallAndRunService.bat
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Interop.WUApiLib.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\logmansessions.cfg
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Microsoft.ServiceFabric.Data.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Microsoft.ServiceFabric.Data.Interfaces.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Microsoft.ServiceFabric.Internal.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Microsoft.ServiceFabric.Internal.Strings.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Microsoft.ServiceFabric.Services.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\Newtonsoft.Json.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentNTService.exe
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentNTService.pdb
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentService.exe
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentService.exe.config
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentService.pdb
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentSFUtility.exe
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\NodeAgentSFUtility.pdb
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\ServiceFabricServiceModel.dll
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\SetupEntryPoint.bat
C:\ProgramData\SF\sfqa06\Fabric\work\Applications\PatchOrchestrationApplicationType_App16\NodeAgentServicePkg.Code.1.1.4\SetupLogs.txt

Service Fabric named service instance to accept configuration parameters

We ran into a limitation of Service Fabric: we cannot pass additional parameters to service instances. What I am referring to is this:

<Service Name="ServiceInstance1">
      <StatelessService ServiceTypeName="ServiceType">
        <SingletonPartition />
      </StatelessService>
    </Service>
    <Service Name="ServiceInstance2">
      <StatelessService ServiceTypeName="ServiceType">
        <SingletonPartition />
      </StatelessService>
    </Service>

Say I have a single service type and would like to create multiple instances of it with different configs. I would expect to be able to do something like this:

<Service Name="ServiceInstance1">
      <StatelessService ServiceTypeName="ServiceType">
        <Parameters>
          <Parameter Name="MyConfig" Value="config1.xml" />
        </Parameters>
        <SingletonPartition />
      </StatelessService>
    </Service>
    <Service Name="ServiceInstance2">
      <StatelessService ServiceTypeName="ServiceType">
        <Parameters>
          <Parameter Name="MyConfig" Value="config2.xml" />
        </Parameters>
        <SingletonPartition />
      </StatelessService>
    </Service>

The only current option is to register two different service types, such as ServiceType1 and ServiceType2, and then create corresponding instances. I can hardly see a need to create multiple instances of the same service type unless they have different configurations or parameters (partitioning and instance count are a separate matter). My question is: how can we avoid registering a new service type for every existing service type when we want to create an instance with a different configuration?

Reverse Proxy - allow overriding X-Forwarded-Host and X-Forwarded-Proto headers

Consider the following configuration:
Internet Facing Proxy -> SF Reverse Proxy -> Stateful Service

If the Internet Facing Proxy sets the X-Forwarded-Host header on the request to SF Reverse Proxy, the Stateful Service receives two headers. In my opinion, SF Reverse Proxy SHOULD NOT add an X-Forwarded-Host header if one is provided by the caller.
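The behaviour proposed above can be sketched in a few lines. This is only an illustration in Python, not Service Fabric's implementation: the function name `forward_headers`, the dict-based header representation, and the example host values are all assumptions made for the sketch.

```python
def forward_headers(incoming: dict, proxy_host: str, proxy_proto: str) -> dict:
    """Return the headers a proxy would send upstream (hypothetical helper).

    Caller-supplied X-Forwarded-* values are kept as they are; the proxy's
    own values are added only when the caller did not send the header.
    """
    # HTTP header names are case-insensitive, so compare in lower case.
    present = {name.lower() for name in incoming}
    outgoing = dict(incoming)
    if "x-forwarded-host" not in present:
        outgoing["X-Forwarded-Host"] = proxy_host
    if "x-forwarded-proto" not in present:
        outgoing["X-Forwarded-Proto"] = proxy_proto
    return outgoing


# The internet facing proxy already set the host, so it is passed through.
headers = forward_headers({"X-Forwarded-Host": "www.example.com"}, "node1:19081", "http")
print(headers["X-Forwarded-Host"])  # www.example.com
```

With this rule the service sees exactly one X-Forwarded-Host value. It is only safe when the caller of the reverse proxy is itself trusted.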

Both the current and the proposed implementation might also be a security issue if some implementation underneath trusts the X-Forwarded-Host header from SF Reverse Proxy in a scenario where SF Reverse Proxy is internet facing (which IMO is a terrible idea).

The current implementation requires inventing new headers for standard situations (including microsoft/service-fabric-issues#112).

Open source Service Fabric tools

Hello!

Are there any plans to open source the Service Fabric SDK and build tools? By that I mean the MSBuild targets/tasks and the PowerShell module used for deployment/management. I've started to make some modifications to FabricActutil, however I'd also like to make some changes to the other target files as well as add PowerShell cmdlets.

Some things I've been looking at/working on:

  • Make FabricActutil smarter with regards to existing/used parameters
  • Add an option to pack SF applications using symlinks instead of copying (significantly reduces our build times)
  • Enable building even if the default 1/5 node cluster publish profiles are missing
  • Built-in way of creating diff packages
  • Probably other stuff

Please do consider it.

Unable to communicate between containers on the same node using Container Host IP address

We have provisioned a Service Fabric Cluster (Win 2016 + Containers) and want to port our legacy applications on this cluster. Due to low latency requirement and the chatty communication between some of our apps, we want these apps to communicate locally. I have used Port binding to expose services hosted within containers. I then tried to use to access the service and I am getting HTTP 404 errors.

I am assuming this is due to WinNAT limitation which has been mentioned in microsoft/service-fabric-issues#227.

Another option I tried was using the IP address or hostname of the container itself for inter-container communication on the same host. I could not find any option in the ServiceManifest for specifying a hostname or IP address.

Is there any way of specifying the Host Name or IP Address while deploying containers on ASF cluster?

Also, if there is any other recommended strategy then please let me know.

Unable to download container image from private repository

I have a SF (Microsoft.Azure.ServiceFabric.WindowsServer.5.6.220.9494) running the ClusterConfig.Unsecure.DevCluster configuration.

Docker was installed beforehand and was working and running these same images. Before installing SF I removed all running containers and cleaned up images and private repository logins.

I am trying to deploy an application pointing to my private repo, e.g. myrepo.azurecr.io/sf/myapp

I am able to use the repository credentials specified in the manifest to login to the repo from docker cli.

When SF tries to deploy the container it states:

Error event: SourceId='System.Hosting', Property='Download:1.0:1.0'.
There was an error during download.Failed to download container image myrepo.azurecr.io/sf/myapp

In the admin log I see this sequence of events:

End(BeginDownloadAndActivate): Error=HostingDeploymentInProgress, VersionedServiceTypeId={MyAppType_App10:MyAppPkg:MyAppType,1.0:1.0:131420102687941111}, ActivationContext=551bf757-1e64-45ca-9812-b99f3875df69, ServicePackagePublicActivationId=d87665d6-1421-4c9d-8d36-73442c5d7b80, SequenceNumber=185
DownloadContainerImages returned 0xd00000e5
Failed to import docker image error 0xd00000e5.
EndSendRequest for image history Error 0xd00000e5
DownloadContainerImages returned 0xd00000e5
Failed to import docker image error 0xd00000e5.
EndSendRequest for image history Error 0xd00000e5
80b671e9b3c9184bbd86d2f150c58135:131419843158391101:131419843669317074 failed to send message AddInstance to node 5101db1125ead8d47d6f93321d3eb754:131419843160891129 with error FABRIC_E_TIMEOUT

ServiceManifest

<EntryPoint>
  <ContainerHost>
    <ImageName>myrepo.azurecr.io/sf/myapp</ImageName>
  </ContainerHost>
</EntryPoint>

ApplicationManifest

<Policies>
  <ContainerHostPolicies CodePackageRef="Code">
    <RepositoryCredentials AccountName="myrepo" Password="mysecret" PasswordEncrypted="false" />
  </ContainerHostPolicies>
</Policies>

Also, it appears that dockerd is not running though I saw in a previous debug log that the docker process manager started dockerd successfully, but then exited with error code 1 which the log said was ok. I haven't seen this happen again in the debug log as of yet.

One other thing to note: the image is rather large, at about 9 GB.

Update: I installed docker on another host, logged into the private repo, pulled the image and then installed SF and was able to deploy the same manifests and run the container successfully. If the image does not exist in docker before SF tries to pull it, then it fails with the errors above.

dockerd process is not running successfully if the image doesn't exist

Update: On my local development workstation running Windows Server 2016, I wiped the images with docker rmi $(docker images -q). Deploying the application to the local dev SF deployment then causes docker to download the image from my private repo. I repeated the same process on the broken 2016 container host. The main difference is that this server runs the Core OS, so there is no GUI, whereas the docker host that works is my 2016 development workstation, so it is local to Visual Studio.

Is my Core container host broken, unsupported, or what? At this point I am mostly interested in finding out what's wrong with it, so that if it's something I did, I don't do it again.

Could not load file error using Microsoft.VisualStudio.Azure.Fabric.MSBuild 1.6.3

Using Visual Studio Community 2017 Version 15.4.2 on Windows 8.1.

Created an out-of-the-box project from the Service Fabric Application project template with an Actor Service, made no additional changes, and did a rebuild (in Debug or Release x64).

The following error is thrown: "System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.ServiceFabric.Actors, Version=6.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified."

The actor service has this reference (Microsoft.ServiceFabric.Actors, Version=6.0.0.0) in its project and is able to build correctly.

Downgraded Microsoft.VisualStudio.Azure.Fabric.MSBuild in the sfproj from 1.6.3 to 1.6.2, and the sfproj project now compiles correctly. Not sure if this is related to issue microsoft/service-fabric-issues#531.

Copy-ServiceFabricApplicationPackage requires re-authorization

Repro

Connect-ServiceFabricCluster -ConnectionEndpoint cluster:19000 -AzureActiveDirectory

# Interactive prompt for authorization occurs

Copy-ServiceFabricApplicationPackage -ApplicationPackagePath C:\applicationpackage -ImageStoreConnectionString fabric:ImageStore -ApplicationPackagePathInImageStore ReactorType 

Expected
Application package is copied to image store.

Actual
Interactive prompt for authorization occurs, again.
If it's authorized, it continues to copy the application package. If it's not authorized (user cancels), or if the PowerShell script is running headless, or if it runs inside the Visual Studio PowerShell Interactive Window, it fails with:

[ERROR] Copy-ServiceFabricApplicationPackage : Exception has been thrown by the target of an invocation.

and with -Verbose:

[VERBOSE] System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.UnauthorizedAccessException: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
   at System.Fabric.Interop.Utility.RunInMTA(Action action)
   at System.Fabric.Management.ImageStore.NativeImageStoreClient.UploadContent(String remoteDestination, String localSource, IImageStoreProgressHandler progressHandler, TimeSpan timeout, CopyFlag imageCopyFlag, Boolean acquireSourceReaderLock)
   at System.Fabric.Management.ImageStore.NativeImageStoreClient.UploadContent(String remoteDestination, String localSource, TimeSpan timeout, CopyFlag imageCopyFlag, Boolean acquireSourceReaderLock)
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Fabric.ImageStoreProxy.UploadContent(String sourceFullPath, String destinationRelativePathInImageStore, TimeSpan timeout)
   at System.Fabric.FabricClient.ImageStoreClient.Upload(String sourceFullPath, String destinationRelativePath, Boolean shouldOverwrite, TimeSpan timeout, IImageStoreProgressHandler progressHandler)
   at Microsoft.ServiceFabric.Powershell.CommonCmdletBase.UploadToImageStore(String imageStoreConnectionString, String src, String des, UploadProgressHandler progressHandler)
   at Microsoft.ServiceFabric.Powershell.CopyApplicationPackage.ProcessRecord()

SFRP in inconsistent state after certificate swap

During certificate rollover, if some of the following conditions are met (not sure which ones):

  1. the new certificate has a different CN than original
  2. swap occurs on primary certificate first
  3. new certificate has later issuance date but old certificate has later expiration date
  4. service fabric endpoint address is not updated to the new CN
  5. original certificate is self signed, new certificate is issued by a proper CA

the upgrade fails, leaving the SFRP resource for the cluster in an inconsistent state. The main symptom of this state is:

Error event: SourceId='UpgradeService.Primary', Property='SFRPPoll'.
Exception encountered: System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: The request was aborted: Could not create SSL/TLS secure channel.

No further ARM upgrade operation can succeed due to UpgradeServiceDown.

Workaround:
Using the upgrade service, downgrade the cluster to a different version and then upgrade it back.

Because reproduction is highly time-consuming, it is difficult to say which of points 1 to 5 are sufficient for this issue to occur.

/cc: @ChackDan

UPDATE: after restoring back to original certificates UpdateService sometimes stabilizes and allows another upgrade (without changing the cluster version)

GetOrAddAsync of IReliableDictionary throwing when already exists.

This usually doesn't happen, but suddenly did, repeatedly. I thought it was supposed to just get it if it already existed? (thus GetOrAdd).
What is this error caused by, and how can I avoid it?

State provider urn:FooBar/dataStore already exists
   at Microsoft.ServiceFabric.Replicator.DynamicStateManager.<AddSingleStateProviderAsync>d__166.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Replicator.DynamicStateManager.<GetOrAddStateProviderAsync>d__1021.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ServiceFabric.Data.ReliableStateManagerImpl.d__191.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)

Improvements in sfproj file format

The CPS-based csproj and vbproj have brought a revolution not only in the project system, but also in the project file format. For instance:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>

</Project>

It would be a wonderful improvement if all default target calls were wrapped into a Service Fabric SDK (with the option to override them in infrequent scenarios), so that the template project file would look like:

<Project Sdk="Microsoft.Azure.ServiceFabric.Sdk">

  <ItemGroup>
    <ProjectReference Include="{path-to-my-csproj}" />
  </ItemGroup>

</Project>

This will:

  • reduce the 47 lines of XML in template to 5-7 lines
  • implicitly include all configs that are under ApplicationPackageRoot, ApplicationParameters, PublishProfiles and Scripts directories.
    • user can override it with a property like DefaultItemExcludes (that new csproj provides) for this purpose.
  • waive the requirement of having a separate packages.config that typically has Microsoft.VisualStudio.Azure.Fabric.MSBuild.

PS: is the SF team considering upstreaming the MSBuild patches to https://github.com/Microsoft/msbuild, so that we are not required to pull Microsoft.VisualStudio.Azure.Fabric.MSBuild separately?

On-prem hardware recommendation, clarify documentation

In reference to this article:
https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-standalone-deployment-preparation

It's surprising to me that the hardware recommendation is 16GB RAM. What is this based on? This seems appropriate in a scenario where reliable collections and stateful services are used, but does not seem appropriate in a scenario where they are not used. For example, in my situation we are only interested in very small stateless OWIN self-hosted services.

Can you provide more detail around how to determine the appropriate hardware?

SF generated user accounts display in login screen

  1. Create and publish an app with Principals>Users defined, e.g. here microsoft/service-fabric-issues#535
  2. RDP in to a node which has the app installed on it.
  3. WF-{randomChars} user will be on login list.

Now it must be possible to hide user accounts from this list, as you're doing with P_FSSUserffffffff and S_FSSUserffffffff. It would be great if you could do this for generated users.

"dotnet msbuild /t:package" doesn't work

Not sure if this was already discussed. I found some similar topics, but none exactly the same.

I ran into some real challenges mixing Service Fabric and a .NET Core solution.
Through Visual Studio and using C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\msbuild.exe myproj.csproj /t:package - everything builds and works fine. However, it's not possible to build it on other machines with no VS2017 installed (e.g. a CI server), even though the latest .NET Core SDK is present.

So the issue is:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\msbuild.exe myproj.csproj /t:package - works
But:
dotnet msbuild myproj.csproj /t:package - doesn't work and throws:

c:\Projects\MyProj\Source\packages\Microsoft.VisualStudio.Azure.Fabric.MSBuild.1.5.0\build\Microsoft.VisualStudio.Azure.Fabric.Application.targets(77,5): error MSB4018: The "ValidatePackageSettings" task failed unexpectedly.
c:\Projects\MyProj\Source\packages\Microsoft.VisualStudio.Azure.Fabric.MSBuild.1.5.0\build\Microsoft.VisualStudio.Azure.Fabric.Application.targets(77,5): error MSB4018: System.IO.FileNotFoundException: Could not load file or assembly 'System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'. The system cannot find the file specified. [c:\Projects\MyProj\Source\MyProj.sfproj]

There is an older Visual Studio 2015 with msbuild 14.0 on the CI server, but it doesn't work either, because it fails to parse .NET Core *.csproj files.

Is installing VS2017 on the CI server really the only option? My first impression was that dotnet msbuild is the same as msbuild.exe and self-sufficient without VS2017. Also, the Microsoft docs (https://docs.microsoft.com/en-us/visualstudio/msbuild/what-s-new-in-msbuild-15-0) say: MSBuild is now available as part of the .NET Core SDK and can build .NET Core projects on Windows, macOS, and Linux.

Am I doing something wrong, or is it a known issue? Can it be worked around somehow? Currently *.sfproj files look a bit outdated compared to .NET Core projects, which is not bad in general, but it's hard to mix these two worlds. I have a feeling that we need to drop either .NET Core or Service Fabric in our project.

Update:
I also investigated VS2017 Build Tools, which is a separate product that does not require VS2017 to be installed. It sounds like a good option for CI builds, however it doesn't have proper support for .NET Core projects (related issue - dotnet/msbuild#1697). So once again, it's hard to mix both worlds.

Reverse Proxy doesn't support dictionary parameters

I have a controller whose action method accepts a dictionary as an argument.

public ActionResult GetSomething([FromQuery] Dictionary<string, string> arg)

When this application runs outside of Service Fabric, the request below is handled successfully.

api/contoller/getsomething?arg[0].Key=k1&arg[0].Value=v1&arg[1].Key=k2&arg[1].Value=v2

When the same request is passed via Reverse Proxy to the service hosted in Service Fabric, I get 400 FABRIC_E_INVALID_ADDRESS. It seems that Reverse Proxy doesn't support dictionary parameters.
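For reference, the indexed Key/Value format in the request above maps to a dictionary roughly as sketched below. This is a simplified Python approximation written for illustration, not ASP.NET's actual model binder; the helper name `bind_dictionary` and the regular expression are assumptions.

```python
import re
from urllib.parse import parse_qsl

# Matches keys such as "arg[0].Key" or "arg[1].Value".
_ENTRY = re.compile(r"^(?P<name>\w+)\[(?P<index>\d+)\]\.(?P<part>Key|Value)$")


def bind_dictionary(query: str, name: str) -> dict:
    """Collect name[i].Key / name[i].Value pairs from a query string."""
    keys, values = {}, {}
    for raw_key, raw_value in parse_qsl(query, keep_blank_values=True):
        match = _ENTRY.match(raw_key)
        if match is None or match.group("name") != name:
            continue  # not part of the requested dictionary
        target = keys if match.group("part") == "Key" else values
        target[int(match.group("index"))] = raw_value
    # Pair each key with the value that shares its index.
    return {keys[i]: values[i] for i in sorted(keys) if i in values}


query = "arg[0].Key=k1&arg[0].Value=v1&arg[1].Key=k2&arg[1].Value=v2"
print(bind_dictionary(query, "arg"))  # {'k1': 'v1', 'k2': 'v2'}
```

Because `parse_qsl` percent-decodes keys, the same helper also accepts the request with the square brackets encoded as %5B and %5D, which makes it usable for comparing what the service receives directly and through the proxy.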

RunAs with Domain Account not working

Hi, I am having issues getting AD RunAs working on a prod instance with the following manifest. Am I missing anything obvious? The docs aren't clear.

Error:

Error getting user account information for Acme\User: status=5, error=E_ACCESSDENIED

Thanks in advance

<?xml version="1.0" encoding="utf-8"?>
<ApplicationManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ApplicationTypeName="AcmeApplicationType" ApplicationTypeVersion="1.0.0" xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <Parameters>
    <Parameter Name="AcmeApplication.HierarchyMetricsService_InstanceCount" DefaultValue="-1" />
    <Parameter Name="AcmeApplication.HierarchyService_InstanceCount" DefaultValue="-1" />
    <Parameter Name="AcmeApplication.MetricsService_InstanceCount" DefaultValue="-1" />
    <Parameter Name="AcmeApplication.Gateway_InstanceCount" DefaultValue="-1" />
    <Parameter Name="AcmeApplication.AcmeOnboardingService_InstanceCount" DefaultValue="-1" />
  </Parameters>
  <!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion 
       should match the Name and Version attributes of the ServiceManifest element defined in the 
       ServiceManifest.xml file. -->
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="AcmeApplication.HierarchyMetricsServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
  </ServiceManifestImport>
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="AcmeApplication.HierarchyServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <RunAsPolicy CodePackageRef="Code" UserRef="SqlUser" />
    </Policies>
  </ServiceManifestImport>
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="AcmeApplication.MetricsServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <RunAsPolicy CodePackageRef="Code" UserRef="SqlUser" />
    </Policies>
  </ServiceManifestImport>
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="AcmeApplication.GatewayPkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
  </ServiceManifestImport>
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="AcmeApplication.AcmeOnboardingServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
  </ServiceManifestImport>
  <DefaultServices>
    <!-- The section below creates instances of service types, when an instance of this 
         application type is created. You can also create one or more instances of service type using the 
         ServiceFabric PowerShell module.
         
         The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
    <Service Name="AcmeApplication.HierarchyMetricsService">
      <StatelessService ServiceTypeName="AcmeApplication.HierarchyMetricsServiceType" InstanceCount="[AcmeApplication.HierarchyMetricsService_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
    <Service Name="AcmeApplication.HierarchyService">
      <StatelessService ServiceTypeName="AcmeApplication.HierarchyServiceType" InstanceCount="[AcmeApplication.HierarchyService_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
    <Service Name="AcmeApplication.MetricsService">
      <StatelessService ServiceTypeName="AcmeApplication.MetricsServiceType" InstanceCount="[AcmeApplication.MetricsService_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
    <Service Name="AcmeApplication.Gateway" ServicePackageActivationMode="ExclusiveProcess">
      <StatelessService ServiceTypeName="AcmeApplication.GatewayType" InstanceCount="[AcmeApplication.Gateway_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
    <Service Name="AcmeApplication.AcmeOnboardingService" ServicePackageActivationMode="ExclusiveProcess">
      <StatelessService ServiceTypeName="AcmeApplication.AcmeOnboardingServiceType" InstanceCount="[AcmeApplication.AcmeOnboardingService_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
  </DefaultServices>
  <Principals>
    <Users>
      <User Name="SqlUser" AccountType="DomainUser" AccountName="Acme\User" Password="MIIBxgYJKoZIhvcNAQcDoIIBtzCCAbMCAQAxggFeMIIBWgIBADBCMC4xLDAqBgNVBAMMI1F1YW50aXVtIFRFU1QgQ2VydGlmaWNhdGUgQXV0aG9yaXR5AhB4BbqpvYuSjU9FRbsijB1wMA0GCSqGSIb3DQEBBzAABIIBAJ2c3DhAPvVqCo60UndvOHuOnutaOpYFKPqLzGTWyEJg61l3PNQNTdeUiYsJ0xaEQR0zFf/1Vxxs4KrUDS/+f1SamhLzhnSrf2QMEVTP0PEFNavQmSfrZ/ItwI5mNzSzsPW0oIBLIpfkXJuAh7ShT8mgxDTU+vUIIxJN2r1kqNVkJIPm6x8mndhCVOjtXlK8xdUD47NxRdaZVFcLRm0qW10VLJRSY2m5BsKrLXxTHm56c1oiE1zOr67HRExMW2hHgRZ2/OBxQz8r+zrOCcbJYmEiCaUdFrsYoQjwtxKX/usBhjlVSad+j3GcN2GRhYEWqslXYPxDGqLxMmy48QIJl2QwTAYJKoZIhvcNAQcBMB0GCWCGSAFlAwQBKgQQGyqD7oTQuqkkM32cQ0wY34AgWsgnN8g3jovyEhykag4BsOvfdKOZFsH9BUuzr8gS9rs=" PasswordEncrypted="true" />
      <User Name="NS" AccountType="NetworkService" />
    </Users>
  </Principals>
  <Policies>
    <SecurityAccessPolicies>
      <SecurityAccessPolicy ResourceRef="Cert1" PrincipalRef="NS" ResourceType="Certificate" />
      <SecurityAccessPolicy ResourceRef="Cert1" PrincipalRef="SqlUser" ResourceType="Certificate" />
    </SecurityAccessPolicies>
  </Policies>
  <Certificates>
    <SecretsCertificate X509FindValue="‎d3 ee 30 35 87 08 8b 1b 00 b1 c5 1e c8 cc f0 9e 09 e7 65 db" Name="Cert1" />
  </Certificates>
</ApplicationManifest>

EndpointResourceDescription.Protocol flexibility

EndpointResourceDescription.Protocol is of enum type EndpointProtocol, which only supports Http, Https, and Tcp.

If someone wants to use UDP, or anything else, it can't be specified in the EndpointProtocol.

I'm presently modifying @Suremaker's RemoteExecution library to support Service Fabric, among other things. I would like to integrate his protocol resolving with Service Fabric's endpoint description protocol.

Here I would much rather have "{description.Protocol}" than "net":
https://github.com/Jm-Fox/RemoteExecution-Net/blob/master/RemoteExecution.ServiceFabric/CommonHelper.cs#L20

Long story short, it would be great if EndpointResourceDescription.Protocol were of type string, not enum.

Debugging remote Service Fabric applications with equal service names

If several applications are deployed in Service Fabric and some of their services have the same name, it's not possible to remote debug those services.
The Attach to Process dialog doesn't show separate instances of services with the same name; they are displayed as one record. When trying to attach to such a service, the Attach to Process dialog closes and no attach progress is displayed.

Enhancement: Show console error output on :19080 monitor page.

The <SetupEntryPoint> and <EntryPoint> may encounter errors that cannot be logged by the application, because they happen before the application initializes its logging component. They are only captured by Service Fabric and written to console logs, if enabled. While we can remote desktop to the VM and find the log, that is really time-consuming on a slow connection, only to find small errors. This is especially annoying when you have many teams deploying to the same cluster; many people need to remote into the VMs.

It would be far better if the monitor page simply showed the reason (some piece of the stderr output) rather than only the process return code, or provided a download link for the console logs.

Problems deploying large Service Fabric application (34 services)

The application we're developing is rather large, containing 34 individual services (the vast majority of which are actor services), and we're having some trouble deploying, upgrading and even deleting the application. (The application is being deployed to a five node cluster with some considerably smaller applications already deployed.)

Deployment kept timing out - and so we had to modify the SDK to pass a timeout parameter, but the application deployed eventually. We then tried to upgrade the application, but upgrading seemed to stick on the first upgrade domain.

After waiting a while, we attempted to delete the application instance from the web control panel but this kept timing out. We then attempted to delete it using PowerShell with the -ForceRemove flag and a -TimeoutSec argument of ten hours, but the operation timed out overnight! So now the cluster is stuck with an application where most of the services are in an 'unknown' state, and which is seemingly impossible to delete.

Has anybody else attempted to deploy an application of this size to Service Fabric? Or encountered any of the issues outlined here?

EDIT: Upon further investigation, I noticed that only the stateful services haven't yet been deleted. I tested deploying to my local 1 node cluster and attempting to delete in the same way, but that succeeded without any problems.

Lots of InvalidOperationException: Transaction is committing or rolling back

This is related to this stack overflow question (https://stackoverflow.com/questions/45960452/how-to-find-out-why-debug-shows-many-exceptions-in-service-fabric)

Using the latest libraries and local dev cluster 5.7.198.9494 in both 1 and 5 node configurations I'm seeing an InvalidOperationException occur which gives the reason: Transaction NNNNNNN is committing or rolling back or has already committed or rolled back

This doesn't go back up to my code, but makes for a noisy output window and makes me wonder what is going on. I guess it is perfectly safe, but it would be nice to understand what is going on and why - and hopefully prevent it happening!

Update
I started tracking FirstChanceExceptions and found that it is also happening in production.

Another update
Easy to reproduce - create a new service fabric application with a stateful service using the latest VS2017 and SDK. Start debugging and stop on Invalid operation exception - one occurs every time the transaction is disposed.

FabricClient.ApplicationManager.CopyApplicationPackage overloads

All the long-running methods of Applications have async versions except CopyApplicationPackage, which feels a little bit strange. Please consider adding an async version of this method.

Also, this method takes an image store parameter which has to be set to a particular value in the cluster. I can understand it being there for compat reasons, but there should be an overload that does not require this parameter and gets the value for it internally. Other methods like ProvisionApplicationAsync do not require this parameter.

This would also apply to RemoveApplicationPackage

Allow singleton stateless services to have multiple instances per node bypassing placement constraint

I've posted this as an Idea in UserVoice, but I think this is a bug, so I will post it here as well.
https://feedback.azure.com/forums/293901-service-fabric/suggestions/19664074-allow-singleton-services-to-have-multiple-instance

Currently, Service Fabric does not allow you to have multiple instances of the same service on the same node, to avoid data loss. But for stateless services this does not apply: as we have neither data to lose nor replication, the orchestrator should know that if my stateless service has more instances than the node count, it should split these instances across the nodes without warning about it.

Unhealthy event: SourceId='System.PLB', Property='ServiceReplicaUnplacedHealth_Secondary_ba654c6a-3e83-4560-9ea1-9fadbe7b115d', HealthState='Warning', ConsiderWarningAsError=false. The Load Balancer was unable to find a placement for one or more of the Service's Replicas: Secondary replica could not be placed due to the following constraints and properties: TargetReplicaSetSize: 9 Placement Constraint: N/A Parent Service: N/A Constraint Elimination Sequence: Existing Secondary Replicas eliminated 3 possible node(s) for placement -- 0/3 node(s) remain.
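The requested behaviour amounts to spreading the instances evenly over the available nodes. The sketch below illustrates that idea in Python with the numbers from the health event above (9 instances, 3 nodes). It is a hypothetical round-robin assignment, not how the Service Fabric load balancer works; the function and node names are made up for the example.

```python
def spread_instances(instance_count: int, nodes: list) -> dict:
    """Assign instances to nodes round-robin (hypothetical desired behaviour)."""
    placement = {node: 0 for node in nodes}
    for i in range(instance_count):
        # Cycle through the nodes so the counts differ by at most one.
        placement[nodes[i % len(nodes)]] += 1
    return placement


print(spread_instances(9, ["node0", "node1", "node2"]))
# {'node0': 3, 'node1': 3, 'node2': 3}
```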

Feature request: Streaming blob reliable collection

It would be very useful to have an IReliableStreamDictionary.

Such string-keyed collection should allow inserting, removing, replacing and reading blob streams. In the minimum scenario such collection should be replicated on the key-level and can have entry size limited according to the replication traffic limits. In an optimal scenario it is replicated on the block-level.

Why is it better than IReliableDictionary<string, Blob>?
Because it eliminates serialization overhead and the need to read the entire blob into memory to do anything with it.

Why is it better than external blob storage e.g. Azure Blob Storage?
Because it is automatically consistent with other collections in a specific partition. It is easy to maintain partition-consistent backups and also easily handle consistent geo-replication scenarios using the SF Backup feature.

Other benefits include a simple coding model and developer experience, potential performance gains, etc.

ServicePartitionClient.InvokeWithRetryAsync retries one extra time

If you create a ServicePartitionClient and provide it with retry settings:

var partitionClient = new ServicePartitionClient(
    communicationClientFactory: factory,
    serviceUri: new Uri("fabric:/Application/Service"),
    partitionKey: new ServicePartitionKey(),
    retrySettings: new OperationRetrySettings(TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(2), 5)
);

await partitionClient.InvokeWithRetryAsync(async client => ...);

With the retry count set to 5, you would expect one initial call plus 5 retries, for a total of 6 calls, but the delegate is invoked 7 times instead.
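
The expected count can be shown with a small, language-neutral sketch in Python. It assumes the retry count means "retries after the first attempt". `invoke_with_retry` and `TransientError` are hypothetical stand-ins, not Service Fabric APIs, and the sketch omits back-off delays.

```python
from typing import Callable


class TransientError(Exception):
    """Stand-in for a retryable failure (hypothetical, not a Service Fabric type)."""


def invoke_with_retry(operation: Callable[[], str], max_retries: int) -> str:
    """Call `operation` once, then retry at most `max_retries` more times."""
    retries_used = 0
    while True:
        try:
            return operation()
        except TransientError:
            if retries_used >= max_retries:
                raise  # retry budget exhausted, surface the failure
            retries_used += 1


calls = 0


def always_fails() -> str:
    """Operation that counts its invocations and never succeeds."""
    global calls
    calls += 1
    raise TransientError("simulated transient failure")


try:
    invoke_with_retry(always_fails, max_retries=5)
except TransientError:
    pass

print(calls)  # 6: one initial call plus five retries
```

Under that reading, a seventh call suggests an off-by-one in the comparison against the retry budget, although the issue itself does not identify the cause.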

Reverse proxy and PatchOrchestration

I use RP, but I want to install the Patch Orchestration application. Looking here, it says it opens an HTTP listener to view the results of attempted installs. Is it possible to secure or disable this, since RP allows access to all HTTP services?

I thought there was an issue requesting a feature to select which services are accessible through RP via configuration -- can't find it though! This would be a great use case. Can't ever think of a reason why non-cluster-maintainers would ever need this information!

Thanks

Provide validation to require application parameter values to be set at application creation time (Was: Unable to delete an Application with a stateful service crashing at startup)

Hi all,

We have an SF app with 5 services. Somehow we were able to deploy the application without triggering a rollback, but the application's configuration is invalid, which causes one of the services to crash at startup.

We tried to delete the application completely, but it is now stuck with a Service Status of "Deleting" for several minutes/hours, with only the crashing service remaining. The service/app are in Warning state with the following error:

DeployedServicePackages Warning Unhealthy deployed service packages: 100% (1/1).
    DeployedServicePackage Warning Unhealthy deployed service package: ApplicationName='fabric:/AppName.App.Dev', ServiceManifestName='AppName.SampleServicePkg', NodeName='_primary_0', AggregatedHealthState='Warning'.
        Event Warning Unhealthy event: SourceId='System.Hosting', Property='CodePackageActivation:Code:EntryPoint', HealthState='Warning', ConsiderWarningAsError=false. There was an error during CodePackage activation. The service host terminated with exit code:1

After some time, it causes the Primary Replica of fabric:/System/NamingService to hit the following Warning health state:

Partitions Warning Unhealthy partitions: 33% (1/3), MaxPercentUnhealthyPartitionsPerService=0%.
    Partition Warning Unhealthy partition: PartitionId='00000000-0000-0000-0000-000000001000', AggregatedHealthState='Warning'.
        Replicas Warning Unhealthy replicas: 33% (1/3), MaxPercentUnhealthyReplicasPerPartition=0%.
            Replica Warning Unhealthy replica: PartitionId='00000000-0000-0000-0000-000000001000', ReplicaOrInstanceId='131263110875599804', AggregatedHealthState='Warning'.
                Event Warning Unhealthy event: SourceId='System.NamingService', Property='Duration_PrimaryRecovery', HealthState='Warning', ConsiderWarningAsError=false. The PrimaryRecovery started at 2016-12-21 16:26:05.265 is taking longer than 30:00.000.

Observations:

  • All other services in the app seem to be properly deleted and are no longer running, but the remaining service is still running on the node when we connect to it through Remote Desktop.
  • Killing the service process will not help, the process will simply be restarted with the Service Status still stuck at "Deleting".
  • Doing a Restart of all Nodes from the SF Explorer will not help.
  • We went through the logs searching for deleting/delete but didn't see anything of interest (needle in haystack, logging level is targeting SF platform developers and not platform end-users).
  • When something goes wrong in SF, we are most of the time in complete darkness.

Cluster Information:

  • Code Version: 5.4.145.9494
  • Nodes: 3
  • Node VM SKU: Standard_A2_v2

Any help with my investigation is welcome.
Thanks,
Simon

[Feature Request] Consider adding Merge operation for Reliable Dictionary

RocksDB has an awesome feature called Merge Operator: https://github.com/facebook/rocksdb/wiki/Merge-Operator

It lets you register an operator that performs the merge operation. Such an operation is very efficient and replicates well, because it decreases replication traffic. It also decreases lock contention and round trips when used from secondary replicas.

A special case of such an operator is CRDTs, which may offer an always-succeed guarantee and can be used lock-free.

IMO it would be great to have such a facility in Service Fabric Reliable Dictionaries.

API: I think the merge operator should be registered in a similar way to a custom serializer. It should also operate at a similar level to the serializer, i.e. on the byte stream. It might be a nice alternative to incremental serialization, which AFAIK is not implemented yet.
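
As a rough illustration of the idea, the following Python sketch shows a byte-level merge operator registered on an in-memory store. `MergeableStore` and `add_int64` are hypothetical names. This is neither the RocksDB API nor a Reliable Dictionary API, and it ignores replication and transactions entirely.

```python
from typing import Callable, Dict, Optional

# A merge operator combines the stored bytes (or None) with an operand.
MergeOperator = Callable[[Optional[bytes], bytes], bytes]


class MergeableStore:
    """In-memory stand-in for a dictionary with a registered merge operator."""

    def __init__(self, merge_operator: MergeOperator) -> None:
        self._merge = merge_operator
        self._data: Dict[str, bytes] = {}

    def merge(self, key: str, operand: bytes) -> None:
        # The caller sends only the operand; the store applies the operator,
        # so no read-modify-write round trip is needed on the caller's side.
        self._data[key] = self._merge(self._data.get(key), operand)

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


def add_int64(existing: Optional[bytes], operand: bytes) -> bytes:
    """Example operator: treat values as little-endian signed 64-bit counters."""
    current = 0
    if existing is not None:
        current = int.from_bytes(existing, "little", signed=True)
    delta = int.from_bytes(operand, "little", signed=True)
    return (current + delta).to_bytes(8, "little", signed=True)


store = MergeableStore(add_int64)
store.merge("hits", (5).to_bytes(8, "little", signed=True))
store.merge("hits", (-2).to_bytes(8, "little", signed=True))
total = int.from_bytes(store.get("hits"), "little", signed=True)
print(total)  # 3
```

In a replicated setting, only the small operand would need to travel to the other replicas, which is where the reduced replication traffic mentioned above would come from.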

ServiceFabric fails to start FabricHostSvc in development environment

Environment:

  • visual_studio_professional_2017_version_15.1_x86_x64_10254675 + MicrosoftAzure-ServiceFabric-CoreSDK
  • Windows 7

Unable to start the test project in Visual Studio. It fails with the following error:


PS C:\WINDOWS\system32> cd 'C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup'
PS C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup> .\DevClusterSetup.ps1

Using Cluster Data Root: C:\SfDevCluster\Data
Using Cluster Log Root: C:\SfDevCluster\Log

Create node configuration succeeded
Starting service FabricHostSvc. This may take a few minutes...
Start-Service : Failed to start service 'Microsoft Service Fabric Host Service (FabricHostSvc)'.
At C:\Program Files\Microsoft SDKs\Service Fabric\Tools\Scripts\ClusterSetupUtilities.psm1:453 char:5
+     Start-Service FabricHostSvc -WarningAction SilentlyContinue
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service], ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

StartLocalCluster : Could not start FabricHostSvc
At C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup\DevClusterSetup.ps1:73 char:1
+ StartLocalCluster
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,StartLocalCluster

Feature Request: reverse proxy option to assume ResourceNotFound by default

With ASP.NET Core, we are using Kestrel exclusively. However, because the reverse proxy assumes port sharing, every microservice has to add an extra header, X-ServiceFabric: ResourceNotFound. This is not microservice friendly, and we get caught out by it every so often.

I propose an option/setting to switch the default behavior.
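
For context, the per-service work described above amounts to something like the following framework-neutral sketch in Python. The function name is hypothetical. In an ASP.NET Core service this logic would normally sit in middleware, which is not shown here.

```python
from typing import Dict


def mark_resource_not_found(status_code: int, headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `headers`, adding the X-ServiceFabric marker on a 404.

    Hypothetical helper: every service has to apply this to its own
    404 responses today, which is the burden the proposal wants to remove.
    """
    result = dict(headers)
    if status_code == 404:
        result["X-ServiceFabric"] = "ResourceNotFound"
    return result


not_found = mark_resource_not_found(404, {"Content-Type": "application/json"})
ok = mark_resource_not_found(200, {"Content-Type": "application/json"})
print(not_found)
print(ok)
```

The proposed setting would make this the reverse proxy's default interpretation of a 404, so that services would no longer need to add the header themselves.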

Containers: show progress when downloading from a registry

Downloading images from a container registry can take a long time (it may even produce an error health event when the service instances don't go up quickly enough). Currently there's no indication what's going on except for the status "Downloading" on the deployed service package object. It would be helpful to show progress (as percentage or preferably bytes).

Service Fabric AppTypes across multiple solutions

Hi,

Hope you are all well. I am facing an issue with Service Fabric that some of you might have faced as well. Right now, the project that I'm working on is structured so that all the microservices are in their own solutions. With Service Fabric, that doesn't seem possible, since all AppTypes have to be in the same solution as the App.

Any ideas if this can be done? I want this because having all the projects in a single solution means longer solution load times, longer build times, a larger repo download, etc.

Looking forward to your help.
