Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for PowerShell hang on exit #5356

Merged
merged 5 commits into from
Nov 14, 2017
Merged

Fix for PowerShell hang on exit #5356

merged 5 commits into from
Nov 14, 2017

Conversation

PaulHigin
Copy link
Contributor

This change addresses the issue #5205

Repro steps are:

$rs = [runspacefactory]::CreateRunspacePool(1,5)
$rs.Open()

$ps1 = [powershell]::Create()
$ps1.RunspacePool = $rs
$null = $ps1.AddScript({sleep 31; 1+1})
$ps1.BeginInvoke()

$ps2 = [powershell]::Create()
$ps2.RunspacePool = $rs
$null = $ps2.AddScript({sleep 32; 1+1})
$ps2.BeginInvoke()

$ps3 = [powershell]::Create()
$ps3.RunspacePool = $rs
$null = $ps3.AddScript({sleep 33; 1+1})
$ps3.BeginInvoke()

exit #hangs

The problem is that the latest version of CoreCLR doesn't seem to be calling finalizers (or the finalizer thread dies for some reason and finalizers are not getting called).

PowerShell relies on the CLR finalizer to clean up state on exit. In this case a runspace pool was not closed or disposed and any pipeline worker threads created to run concurrent scripts won't end, causing the hang. The same thing can happen if any individual runspace is created to run concurrent script and is not closed.

PowerShell really needs to rely on CLR finalizer to clean up because it can be hosted in many ways and it cannot know when a session is being intentionally ended or some helper session is being closed.

I'll contact the DotNet CoreCLR team to see if this is a known issue.

But in the mean time I have a work around to prevent the hang. The workaround uses existing code that tries to determine when a session is ending and stop transcription. This code has a bug and I have rewritten it to use a new Runspace.IsInteractiveHost property. When this is set to true in indicates to the runspace clean up code that the entire PowerShell session is ending and that any remaining active runspaces should be closed (along with pipeline threads). This way any application hosting PowerShell can use the property to ensure full runspace cleanup on exit.

@PaulHigin
Copy link
Contributor Author

FYI I have created an Issue about finalizers not being called for the CLR team(https://github.com/dotnet/coreclr/issues/14889).

@PaulHigin
Copy link
Contributor Author

It turns out that this "CLR no longer calling finalizer on shutdown" is by design. Making it work across multiple platforms is probably much harder than making it work just on Windows.

I am going to think about this some more so don't bother reviewing this just yet. I will look into using the AppDomain.DomainUnload.Unload event. I suspect it is no more reliable than the finalizer thread, since it is just another thread for clean up, after all. But we can use it as notification for process shut down.

@daxian-dbw daxian-dbw changed the title Workaround for PowerShell hang on exit [On Hold] Workaround for PowerShell hang on exit Nov 7, 2017
@SteveL-MSFT SteveL-MSFT modified the milestones: 6.0.0-RC, 6.0.0-HighPriority Nov 9, 2017
@PaulHigin PaulHigin changed the title [On Hold] Workaround for PowerShell hang on exit Fix for PowerShell hang on exit Nov 13, 2017
@PaulHigin
Copy link
Contributor Author

I have removed the "On Hold" and "Workaround" designations from the title. After talking with the CLR team, we can no longer rely on CLR events or finalizer calls on exit. The AppDomain.DomainUnload is not supported. The AppDomain.ProcessExit event is not helpful since it is only called during application exit which means threads already have to be cleaned up, plus there is a time limit.

So we will need to do our own clean up in this change.
I have introduced a new Runspace property (IsInteractiveHost) to indicate that runspace clean up must be performed when the interactive host runspace closes. I made this public so that other hosts can use it as needed.

/// A runspace designated as interactive host means that it is the primary runspace, and when it closes
/// any other open runspace on the system should be forcibly closed.
/// </summary>
public bool IsInteractiveHost { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked to Paul offline about this property -- how to make sure this property is set only once so that there won't be a misuse of the property.

One feedback is to make it internal if we want to get this fix in for RC release, so that any follow-up work on this matter between RC and GA won't be a breaking change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the name of the property perhaps can be changed to IsDefaultRunspace, without the Host in it, so there is less confusion about Runspace vs. Host.

PSHostUserInterface host = executionContext.EngineHostInterface.UI;
if (host != null)
// Stop all transcriptions and unitialize AMSI if we're the last runspace to exit or we are an exiting interactive host.
if (this.IsInteractiveHost || !haveOpenRunspaces)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this check should be if (!haveOpenRunspaces). When closing the default runspace, it's possible there are other runspaces still running code, which may be calling into AmsiUtils, so calling AmsiUtils.Uninitialize() may cause unexpected error.
AmsiUtils.Uninitialize() and StopAllTranscribing() will called anyway when the last runspace is being closed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just my thought, I didn't verify the scenario of calling AmsiUtils.Uninitialize() while having another runspace running.

@@ -897,6 +889,18 @@ private void DoCloseHelper()
//Raise Event
RaiseRunspaceStateEvents();

// Close all open runspaces to ensure that any associated pipeline thread is released.
if (closeAllOpenRunspaces)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about ISE here. Each tab is an interactive host with a different Runspace, so which one should be the default Runspace?
A user can type ctrl+t and ctrl+w to create a new tab and close an existing tab, so it's very common that the first tab gets closed by ctrl+w later. If the first tab's Runspace is set to be the default, then closing that tab would terminate all tabs ...

@PaulHigin
Copy link
Contributor Author

Per offline discussion with Dongbo I will make some changes. We should not use the term "Default Runspace" because it has a current meaning for a runspace per thread that is available to use for running scriptblocks. This clean up code is meant to be used only once per process session and is intended to prevent process hang on exit due to dangling threads (because runspaces are not cleaned up).

Therefore this property should be called "PrimaryRunspace" and should be static. When the PrimaryRunspace is closing it means that the PowerShell session is ending and on exit clean should be performed. I will make it internal for now but feel it should eventually be public so that other hosters of PowerShell can use it. The purpose of the property is to do on exit clean up.

I'll submit a new commit with these changes.

{
throw new PSInvalidOperationException(RunspaceStrings.PrimaryRunspaceAlreadySet);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A race condition may happen when setting this property from two threads. How about using Interlocked.CompareExchange<Runspace>? For example:

set
{
    var original = Interlocked.CompareExchange<Runspace>(ref s_primaryRunspace, value, null);
    if (original != null)
    {
        throw new PSInvalidOperationException(RunspaceStrings.PrimaryRunspaceAlreadySet);
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A race condition under a lock? How can that happen?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, a lock ... I missed that. I guess I came too early this morning :)

PSHostUserInterface host = executionContext.EngineHostInterface.UI;
if (host != null)
// Stop all transcriptions and unitialize AMSI if we're the last runspace to exit or we are exiting the primary runspace.
if (isPrimaryRunspace || !haveOpenRunspaces)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please respond to my previous comment here?

Perhaps this check should be if (!haveOpenRunspaces). When closing the default runspace, it's possible there are other runspaces still running code, which may be calling into AmsiUtils, so calling AmsiUtils.Uninitialize() may cause unexpected error.
AmsiUtils.Uninitialize() and StopAllTranscribing() will called anyway when the last runspace is being closed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Will just use haveOpenRunspaces.


// Report telemetry if we have no more open runspaces.
// TODO: Is this still needed?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed. The old telemetry code is kept as references for the later rewriting work. /cc @JamesWTruher

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove the comment.

@@ -897,8 +890,19 @@ private void DoCloseHelper()
//Raise Event
RaiseRunspaceStateEvents();

// Report telemetry if we have no more open runspaces.
if (closeAllOpenRunspaces)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution won't work well with ISE-like applications. Maybe we should open an issue to track it.

I'm thinking about ISE here. Each tab is an interactive host with a different Runspace, so which one should be the default Runspace?
A user can type ctrl+t and ctrl+w to create a new tab and close an existing tab, so it's very common that the first tab gets closed by ctrl+w later. If the first tab's Runspace is set to be the default, then closing that tab would terminate all tabs ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if there is no primary runspace then there is a problem with auto clean up. In this case the static variable should not be used and the application needs to do its own cleanup on exit. I don't think an issue needs to be created since this is just being used internally right now for PowerShell shell.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think an issue needs to be created since this is just being used internally right now for PowerShell shell.

Agreed. #Closed

Copy link
Member

@daxian-dbw daxian-dbw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Are you going to add tests for this fix?

@daxian-dbw
Copy link
Member

Merging this PR. I will add the test to this fix soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants